Difference between just reshaping and reshaping and getting transpose? - python

I'm currently working through the CS231 assignments and ran into something confusing. When calculating gradients, if I first reshape x and then take the transpose, I get the correct result.
x_r=x.reshape(x.shape[0],-1)
dw= x_r.T.dot(dout)
However, when I reshape x directly into the shape of the transpose, I don't get the correct result.
dw = x.reshape(-1,x.shape[0]).dot(dout)
Can someone explain the following?
How does np.reshape() order the elements it reads?
How does reshaping an array of shape (N, d1, d2, ..., dn) into (N, D) and then transposing differ from reshaping it directly into (D, N)?

While both of your approaches result in arrays of the same shape, there will be a difference in the order of elements due to the way numpy reads / writes elements. By default, reshape uses a C-like index order, which means the elements are read / written with the last axis index changing fastest, back to the first axis index changing slowest (taken from the documentation).
Here is an example of what that means in practice. Let's assume the following array x:
x = np.asarray([[[1, 2], [3, 4], [5, 6]], [[7, 8], [9, 10], [11, 12]]])
print(x.shape) # (2, 3, 2)
print(x)
# output
[[[ 1 2]
[ 3 4]
[ 5 6]]
[[ 7 8]
[ 9 10]
[11 12]]]
Now let's reshape this array the following two ways:
opt1 = x.reshape(x.shape[0], -1)
opt2 = x.reshape(-1, x.shape[0])
print(opt1.shape) # output: (2, 6)
print(opt2.shape) # output: (6, 2)
print(opt1)
# output:
[[ 1 2 3 4 5 6]
[ 7 8 9 10 11 12]]
print(opt2)
# output:
[[ 1 2]
[ 3 4]
[ 5 6]
[ 7 8]
[ 9 10]
[11 12]]
reshape first inferred the shape of the new arrays and then returned a view where it read the elements in C-like index order.
To exemplify this on opt1: since the original array x has 12 elements, it inferred that the new array opt1 must have a shape of (2, 6) (because 2*6=12). Now, reshape returns a view where:
opt1[0][0] == x[0][0][0]
opt1[0][1] == x[0][0][1]
opt1[0][2] == x[0][1][0]
opt1[0][3] == x[0][1][1]
opt1[0][4] == x[0][2][0]
opt1[0][5] == x[0][2][1]
opt1[1][0] == x[1][0][0]
...
opt1[1][5] == x[1][2][1]
So as described above, the last axis index changes fastest and the first axis index slowest. In the same way, the output for opt2 will be computed.
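As a rough check of the C-order rule described above: a C-order reshape never changes the flattened sequence of elements, so x, opt1 and opt2 all ravel to the same 1-D array, and np.unravel_index shows which multi-index a flat position corresponds to in each shape. The names k, idx_x and idx_opt1 are only illustrative.

```python
import numpy as np

x = np.arange(1, 13).reshape(2, 3, 2)   # same values as the example array x
opt1 = x.reshape(x.shape[0], -1)        # shape (2, 6)
opt2 = x.reshape(-1, x.shape[0])        # shape (6, 2)

# A C-order reshape keeps the flattened element order unchanged.
same_order = (np.array_equal(x.ravel(), opt1.ravel())
              and np.array_equal(x.ravel(), opt2.ravel()))
print(same_order)  # True

# Flat position k maps to a different multi-index in each shape,
# but always to the same element.
k = 4
idx_x = np.unravel_index(k, x.shape)        # (0, 2, 0)
idx_opt1 = np.unravel_index(k, opt1.shape)  # (0, 4)
print(x[idx_x] == opt1[idx_opt1])  # True
```

This matches the listing above, where opt1[0][4] == x[0][2][0].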
You can now verify that transposing the first option will result in the same shape but a different order of elements:
opt1 = opt1.T
print(opt1.shape) # output: (6, 2)
print(opt1)
# output:
[[ 1 7]
[ 2 8]
[ 3 9]
[ 4 10]
[ 5 11]
[ 6 12]]
Obviously, the two approaches do not result in the same array due to element ordering, even though they will have the same shape.
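A small sketch of the comparison, using the same values as x above. It also shows one way (an addition here, not part of the original question) to get the transposed layout with a single reshape: move the first axis to the end with transpose before reshaping. The variable names are illustrative.

```python
import numpy as np

x = np.arange(1, 13).reshape(2, 3, 2)   # same values as the example array x
n = x.shape[0]

flat_then_transpose = x.reshape(n, -1).T            # the approach that works
direct = x.reshape(-1, n)                           # the approach that does not
axes_moved = x.transpose(1, 2, 0).reshape(-1, n)    # move axis 0 last, then reshape

print(flat_then_transpose.shape == direct.shape)           # True
print(np.array_equal(flat_then_transpose, direct))         # False
print(np.array_equal(flat_then_transpose, axes_moved))     # True
```

So the shapes agree in all three cases, but only the versions that actually move the sample axis keep each sample's features together in one column.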

Related

Concatenate fails in simple example

I am trying the simple examples of this page
In it it says:
arr=np.array([4,7,12])
arr1=np.array([5,9,15])
np.concatenate((arr,arr1))
# Must give array([ 4, 7, 12, 5, 9, 15])
np.concatenate((arr,arr1),axis=1)
#Must give
#[[4,5],[7,9],[12,15]]
# but it gives *** numpy.AxisError: axis 1 is out of bounds for array of dimension 1
Why is this example not working?
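np.concatenate joins arrays along an axis that already exists, and 1-D arrays only have axis 0, hence the AxisError. np.vstack is what you're looking for. Note the transpose at the end, which converts vstack's 2x3 result to a 3x2 array.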
import numpy as np
arr = np.array([4,7,12])
arr1 = np.array([5,9,15])
a = np.vstack((arr,arr1)).T
print(a)
Output:
[[ 4 5]
[ 7 9]
[12 15]]
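As an alternative sketch, np.stack and np.column_stack build the same 3x2 array without the explicit transpose. Both create the second axis rather than joining along an existing one.

```python
import numpy as np

arr = np.array([4, 7, 12])
arr1 = np.array([5, 9, 15])

# 1-D arrays only have axis 0, so axis 1 has to be created, not joined along.
stacked = np.stack((arr, arr1), axis=1)   # new axis inserted at position 1
columns = np.column_stack((arr, arr1))    # treats each 1-D array as a column

print(stacked)
print(np.array_equal(stacked, columns))                    # True
print(np.array_equal(stacked, np.vstack((arr, arr1)).T))   # True
```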

Numpy array add inplace, values not summed when select same row multiple times

Suppose I have a 2D matrix and want to select a few rows and add another array of the matching shape to them in place. The problem is that when a row is selected multiple times, the values from the other array are not summed:
Example:
I have a 5x3 matrix:
>>> import numpy as np
>>> x = np.arange(15).reshape((5,3))
>>> print(x)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]
[12 13 14]]
I want to select a few rows, and add values:
>>> x[np.array([[1,1],[2,3]])] # row 1 is selected twice
[[[ 3 4 5]
[ 3 4 5]]
[[ 6 7 8]
[ 9 10 11]]]
>>> add_value = np.random.randint(0,10,(2,2,3))
[[[6 1 2] # add to row 1
[9 8 5]] # add to row 1 again!
[[5 0 5] # add to row 2
[1 9 3]]] # add to row 3
>>> x[np.array([[1,1],[2,3]])] += add_value
>>> print(x)
[[ 0 1 2]
[12 12 10] # [12,12,10]=[3,4,5]+[9,8,5]
[11 7 13]
[10 19 14]
[12 13 14]]
As shown above, row 1 is [12,12,10], which means [9,8,5] and [6,1,2] were not both added to row 1; only [9,8,5] was. Are there any solutions? Thanks!
This behavior is described in the numpy documentation, near the bottom of this page, under "assigning values to indexed arrays":
https://numpy.org/doc/stable/user/basics.indexing.html#basics-indexing
Quoting:
Unlike some of the references (such as array and mask indices) assignments are always made to the original data in the array (indeed, nothing else would make sense!). Note though, that some actions may not work as one may naively expect. This particular example is often surprising to people:
>>> x = np.arange(0, 50, 10)
>>> x
array([ 0, 10, 20, 30, 40])
>>> x[np.array([1, 1, 3, 1])] += 1
>>> x
array([ 0, 11, 20, 31, 40])
Where people expect that the 1st location will be incremented by 3. In fact, it will only be incremented by 1. The reason is that a new array is extracted from the original (as a temporary) containing the values at 1, 1, 3, 1, then the value 1 is added to the temporary, and then the temporary is assigned back to the original array. Thus the value of the array at x[1] + 1 is assigned to x[1] three times, rather than being incremented 3 times.
Just wanna share what @hpaulj suggests, which uses np.add.at:
>>> import numpy as np
>>> x = np.arange(15).reshape((5,3))
>>> select = np.array([[1,1],[2,3]])
>>> add_value = np.array([[[6,1,2],[9,8,5]],[[5,0,5],[1,9,3]]])
>>> np.add.at(x, select.flatten(), add_value.reshape(-1, add_value.shape[-1]))
>>> print(x)
[[ 0 1 2]
[18 13 12]
[11 7 13]
[10 19 14]
[12 13 14]]
Now row 1 is [18,13,12], which is the sum of [3,4,5], [6,1,2] and [9,8,5].
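The same difference can be seen on the 1-D example quoted from the documentation. This is a minimal sketch; the names buffered and unbuffered are only illustrative.

```python
import numpy as np

# Buffered fancy-index addition: index 1 is only incremented once.
buffered = np.arange(0, 50, 10)
buffered[np.array([1, 1, 3, 1])] += 1
print(buffered)    # [ 0 11 20 31 40]

# Unbuffered addition with np.add.at: every occurrence of index 1 counts.
unbuffered = np.arange(0, 50, 10)
np.add.at(unbuffered, np.array([1, 1, 3, 1]), 1)
print(unbuffered)  # [ 0 13 20 31 40]
```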

Subtracting minimum of row from the row

I know that
a - a.min(axis=0)
will subtract the minimum of each column from every element in the column. I want to subtract the minimum in each row from every element in the row. I know that
a.min(axis=1)
specifies the minimum within a row, but how do I tell the subtraction to go by rows instead of columns? (How do I specify the axis of the subtraction?)
edit: For my question, a is a 2d array in NumPy.
Assuming a is a numpy array, you can use this:
new_a = a - np.min(a, axis=1)[:,None]
Try it out:
import numpy as np
a = np.arange(24).reshape((4,6))
print (a)
new_a = a - np.min(a, axis=1)[:,None]
print (new_a)
Result:
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
[[0 1 2 3 4 5]
[0 1 2 3 4 5]
[0 1 2 3 4 5]
[0 1 2 3 4 5]]
Note that np.min(a, axis=1) returns a 1d array of row-wise minimum values.
We then add an extra dimension to it using [:,None]. It then looks like this 2d array:
array([[ 0],
[ 6],
[12],
[18]])
When this 2d array participates in the subtraction, it gets broadcast to a shape of (4,6), which looks like this:
array([[ 0, 0, 0, 0, 0, 0],
[ 6, 6, 6, 6, 6, 6],
[12, 12, 12, 12, 12, 12],
[18, 18, 18, 18, 18, 18]])
Now, element-wise subtraction happens between the two (4,6) arrays.
Specify keepdims=True to preserve a length-1 dimension in place of the dimension that min collapses, allowing broadcasting to work out naturally:
a - a.min(axis=1, keepdims=True)
This is especially convenient when axis is determined at runtime, and it is probably still clearer than manually reintroducing the squashed dimension even when the axis is fixed at 1.
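A minimal sketch of the runtime-axis case. The helper name subtract_min is hypothetical and only wraps the keepdims expression above.

```python
import numpy as np

def subtract_min(a, axis):
    """Hypothetical helper: subtract the minimum along `axis` from `a`."""
    # keepdims=True leaves a length-1 axis behind, so broadcasting lines up
    # for any choice of axis.
    return a - a.min(axis=axis, keepdims=True)

a = np.arange(24).reshape((4, 6))
by_row = subtract_min(a, axis=1)   # each row minus its own minimum
by_col = subtract_min(a, axis=0)   # each column minus its own minimum
print(by_row[1])     # [0 1 2 3 4 5]
print(by_col[:, 0])  # [ 0  6 12 18]
```

With [:,None] the position of the new axis would have to change with the value of axis; keepdims handles that without extra code.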
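If you want to use only pandas, you can apply a lambda to every row and subtract min(row) from the value in each column:
import pandas as pd
new_df = pd.DataFrame()
for i, col in enumerate(df.columns):
    # axis=1 passes each row to the lambda; iloc picks the i-th value by position
    new_df[col] = df.apply(lambda row: row.iloc[i] - min(row), axis=1)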
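A vectorized pandas alternative, sketched under the assumption that df holds only numeric columns. The example data below is made up to mirror the NumPy array used earlier.

```python
import numpy as np
import pandas as pd

# Example data (an assumption, chosen to match the NumPy example above).
df = pd.DataFrame(np.arange(24).reshape((4, 6)))

# df.min(axis=1) gives one minimum per row; axis=0 in sub() aligns that
# Series with the row index, so each row has its own minimum subtracted.
new_df = df.sub(df.min(axis=1), axis=0)
print(new_df)
```

This avoids the Python-level loop over columns and the per-row lambda.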

Numpy sorted array for loop error but original works fine

After I sort this numpy array and remove all duplicate (y) values along with the corresponding (x) value for each duplicate, I use a for loop to draw rectangles at the remaining coordinates. Yet I get the error: ValueError: too many values to unpack (expected 2), even though the array has the same layout as the original, just with the duplicates removed.
from graphics import *
import numpy as np
def main():
    win = GraphWin("A Window", 500, 500)
    # starting array
    startArray = np.array([[2, 1, 2, 3, 4, 7],
                           [5, 4, 8, 3, 7, 8]])
    # the following reshapes the from all x's in one row and y's in second row
    # to x,y rows pairing the x with corresponding y value.
    # then it searches for duplicate (y) values and removes both the duplicate (y) and
    # its corresponding (x) value by removing the row.
    # then the unique [x,y]'s array is reshaped back to a [[x,....],[y,....]] array to be used to draw rectangles.
    d = startArray.reshape((-1), order='F')
    # reshape to [x,y] matching the proper x&y's together
    e = d.reshape((-1, 2), order='C')
    # searching for duplicate (y) values and removing that row so the corresponding (x) is removed too.
    f = e[np.unique(e[:, 1], return_index=True)[1]]
    # converting unique array back to original shape
    almostdone = f.reshape((-1), order='C')
    # final reshape to return to original starting shape but is only unique values
    done = almostdone.reshape((2, -1), order='F')
    # print all the shapes and elements
    print("this is d reshape of original/start array:", d)
    print("this is e reshape of d:\n", e)
    print("this is f unique of e:\n", f)
    print("this is almost done:\n", almostdone)
    print("this is done:\n", done)
    print("this is original array:\n",startArray)
    # loop to draw a rectangle with each x,y value being pulled from the x and y rows
    # says too many values to unpack?
    for x,y in np.nditer(done,flags = ['external_loop'], order = 'F'):
        print("this is x,y:", x,y)
        print("this is y:", y)
        rect = Rectangle(Point(x,y),Point(x+4,y+4))
        rect.draw(win)
    win.getMouse()
    win.close()
main()
here is the output:
line 42, in main
for x,y in np.nditer(done,flags = ['external_loop'], order = 'F'):
ValueError: too many values to unpack (expected 2)
this is d reshape of original/start array: [2 5 1 4 2 8 3 3 4 7 7 8]
this is e reshape of d:
[[2 5]
[1 4]
[2 8]
[3 3]
[4 7]
[7 8]]
this is f unique of e:
[[3 3]
[1 4]
[2 5]
[4 7]
[2 8]]
this is almost done:
[3 3 1 4 2 5 4 7 2 8]
this is done:
[[3 1 2 4 2]
[3 4 5 7 8]]
this is original array:
[[2 1 2 3 4 7]
[5 4 8 3 7 8]]
why would the for loop work for the original array but not this sorted one?
or what loop could I use to just use (f) since it is sorted but shape(-1,2)?
I also tried a different loop:
for x,y in done[np.nditer(done,flags = ['external_loop'], order = 'F')]:
Which seems to fix the too many values error but I get:
IndexError: index 3 is out of bounds for axis 0 with size 2
and
FutureWarning: Using a non-tuple sequence for multidimensional indexing is
deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this
will be interpreted as an array index, `arr[np.array(seq)]`, which will
result either in an error or a different result.
for x,y in done[np.nditer(done,flags = ['external_loop'], order = 'F')]:
which I've looked up on stackexchange to fix but keep getting the error regardless of how I do the syntax.
any help would be great thanks!
I don't have the graphics package (it might be a windows specific thing?), but I do know that you're making this waaaaay too complicated. Here is a much simpler version that produces the same done array:
from graphics import *
import numpy as np
# window taken from the question's code
win = GraphWin("A Window", 500, 500)
# starting array
startArray = np.array([[2, 1, 2, 3, 4, 7],
                       [5, 4, 8, 3, 7, 8]])
# searching for duplicate (y) values and removing that row so the corresponding (x) is removed too.
done = startArray.T[np.unique(startArray[1,:], return_index=True)[1]]
for x,y in done:
    print("this is x,y:", x, y)
    print("this is y:", y)
    rect = Rectangle(Point(x,y),Point(x+4,y+4))
    rect.draw(win)
To note, in the above version done.shape==(5, 2) instead of (2, 5), but you can always change that back after the for loop with done = done.T.
Here are some notes on your original code for future reference:
The order flag in reshape is completely superfluous to what your code is trying to do, and just makes it more confusing/potentially more buggy. You can do all of the reshapes you wanted to do without it.
The use case for nditer is to iterate over the individual elements of one (or more) array one at a time. It cannot in general be used to iterate over the rows or columns of a 2D array. If you try to use it this way, you're likely to get buggy results that are highly dependent on the layout of the array in memory (as you saw).
To iterate over rows or columns of a 2D array, just use simple iteration. If you just iterate over an array (eg for row in arr:), you get each row, one at a time. If you want the columns instead you can transpose the array first (like I did in the code above with .T).
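To illustrate the layout dependence mentioned in the notes above, here is a sketch that counts the chunks nditer hands back for a C-contiguous array like startArray and for an array built the same way as done in the question. The helper chunk_lengths is hypothetical, and the chunk sizes in the comments reflect my understanding of how nditer merges contiguous axes.

```python
import numpy as np

# C-contiguous, like startArray in the question.
c_layout = np.array([[2, 1, 2, 3, 4, 7],
                     [5, 4, 8, 3, 7, 8]])
# Built like `done` in the question: a 1-D array reshaped with order='F'.
f_layout = np.array([3, 3, 1, 4, 2, 5, 4, 7, 2, 8]).reshape((2, -1), order='F')

def chunk_lengths(arr):
    """Hypothetical helper: sizes of the chunks nditer yields for arr."""
    it = np.nditer(arr, flags=['external_loop'], order='F')
    return [chunk.size for chunk in it]

print(chunk_lengths(c_layout))  # six chunks of 2, so `for x, y in ...` unpacks
print(chunk_lengths(f_layout))  # one chunk of 10, so unpacking into x, y fails

# Plain iteration over the transpose does not depend on memory layout.
pairs = [(int(x), int(y)) for x, y in f_layout.T]
print(pairs)
```

This is consistent with the question: the loop happened to work on the original array and raised "too many values to unpack" on the rebuilt one.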
Note about .T
.T takes the transpose of an array. For example, if you start with
arr = np.array([[0, 1, 2, 3],
[4, 5, 6, 7]])
then the transpose is:
arr.T==np.array([[0, 4],
[1, 5],
[2, 6],
[3, 7]])

tensorflow: how to interleave columns of two tensors (e.g. using tf.scatter_nd)?

I've read the tf.scatter_nd documentation and run the example code for 1D and 3D tensors... and now I'm trying to do it for a 2D tensor. I want to 'interleave' the columns of two tensors. For 1D tensors, one can do this via
'''
We want to interleave elements of 1D tensors arr1 and arr2, where
arr1 = [10, 11, 12]
arr2 = [1, 2, 3, 4, 5, 6]
such that
desired result = [1, 2, 10, 3, 4, 11, 5, 6, 12]
'''
import tensorflow as tf
with tf.Session() as sess:
    updates1 = tf.constant([1,2,3,4,5,6])
    indices1 = tf.constant([[0], [1], [3], [4], [6], [7]])
    shape = tf.constant([9])
    scatter1 = tf.scatter_nd(indices1, updates1, shape)
    updates2 = tf.constant([10,11,12])
    indices2 = tf.constant([[2], [5], [8]])
    scatter2 = tf.scatter_nd(indices2, updates2, shape)
    result = scatter1 + scatter2
    print(sess.run(result))
(aside: is there a better way to do this? I'm all ears.)
This gives the output
[ 1 2 10 3 4 11 5 6 12]
Yay! that worked!
Now let's try to extend this to 2D.
'''
We want to interleave the *columns* (not rows; rows would be easy!) of
arr1 = [[1,2,3,4,5,6],[1,2,3,4,5,6],[1,2,3,4,5,6]]
arr2 = [[10,11,12],[10,11,12],[10,11,12]]
such that
desired result = [[1,2,10,3,4,11,5,6,12],[1,2,10,3,4,11,5,6,12],[1,2,10,3,4,11,5,6,12]]
'''
updates1 = tf.constant([[1,2,3,4,5,6],[1,2,3,4,5,6],[1,2,3,4,5,6]])
indices1 = tf.constant([[0], [1], [3], [4], [6], [7]])
shape = tf.constant([3, 9])
scatter1 = tf.scatter_nd(indices1, updates1, shape)
This gives the error
ValueError: The outer 1 dimensions of indices.shape=[6,1] must match the outer 1
dimensions of updates.shape=[3,6]: Dimension 0 in both shapes must be equal, but
are 6 and 3. Shapes are [6] and [3]. for 'ScatterNd_2' (op: 'ScatterNd') with
input shapes: [6,1], [3,6], [2].
Seems like my indices is specifying row indices instead of column indices, and given the way that arrays are "connected" in numpy and tensorflow (i.e. row-major order), does that mean
I need to explicitly specify every single pair of indices for every element in updates1?
Or is there some kind of 'wildcard' specification I can use for the rows? (Note indices1 = tf.constant([[:,0], [:,1], [:,3], [:,4], [:,6], [:,7]]) gives syntax errors, as it probably should.)
Would it be easier to just do a transpose, interleave the rows, then transpose back?
Because I tried that...
scatter1 = tf.scatter_nd(indices1, tf.transpose(updates1), tf.transpose(shape))
print(sess.run(tf.transpose(scatter1)))
...and got a much longer error message, that I don't feel like posting unless someone requests it.
PS- I searched to make sure this isn't a duplicate -- I find it hard to imagine that someone else hasn't asked this before -- but turned up nothing.
This is pure slicing but I didn't know that syntax like arr1[0:,:][:,:2] actually works. It seems it does but not sure if it is better.
This may be the wildcard slicing mechanism you are looking for.
arr1 = tf.constant([[1,2,3,4,5,6],[1,2,3,4,5,7],[1,2,3,4,5,8]])
arr2 = tf.constant([[10, 11, 12], [10, 11, 12], [10, 11, 12]])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(tf.concat([arr1[0:,:][:,:2], arr2[0:,:][:,:1],
                              arr1[0:,:][:,2:4], arr2[0:,:][:,1:2],
                              arr1[0:,:][:,4:6], arr2[0:,:][:,2:3]], axis=1)))
Output is
[[ 1 2 10 3 4 11 5 6 12]
[ 1 2 10 3 4 11 5 7 12]
[ 1 2 10 3 4 11 5 8 12]]
So, for example,
arr1[0:,:] returns
[[1 2 3 4 5 6]
[1 2 3 4 5 7]
[1 2 3 4 5 8]]
and arr1[0:,:][:,:2] returns the first two columns
[[1 2]
[1 2]
[1 2]]
axis is 1.
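When there are many groups, listing each slice by hand gets long. A sketch of the same interleaving with one reshape and one concatenate is shown here in NumPy so it can be run without a session. tf.reshape and tf.concat take analogous arguments, but the TensorFlow version is not tested here, and the names g1, g2, n_rows and n_groups are illustrative.

```python
import numpy as np

arr1 = np.array([[1, 2, 3, 4, 5, 6]] * 3)   # shape (3, 6)
arr2 = np.array([[10, 11, 12]] * 3)         # shape (3, 3)

n_rows, n_groups = arr2.shape
# Split each row into groups: two columns of arr1 plus one column of arr2.
g1 = arr1.reshape(n_rows, n_groups, -1)     # shape (3, 3, 2)
g2 = arr2.reshape(n_rows, n_groups, 1)      # shape (3, 3, 1)
# Join within each group, then flatten the groups back into columns.
result = np.concatenate((g1, g2), axis=2).reshape(n_rows, -1)
print(result)
```

Each row of result reads 1, 2, 10, 3, 4, 11, 5, 6, 12, which is the desired interleaving from the question.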
Some moderators might have regarded my question as a duplicate of this one, not because the questions are the same, but only because the answers contain parts one can use to answer this question -- i.e. specifying every index combination by hand.
A totally different method would be to multiply by a permutation matrix as shown in the last answer to this question. Since my original question was about scatter_nd, I'm going to post this solution but wait to see what other answers come in... (Alternatively, I or someone could edit the question to make it about reordering columns, not specific to scatter_nd --EDIT: I have just edited the question title to reflect this).
Here, we concatenate the two different arrays/tensors...
import numpy as np
import tensorflow as tf
sess = tf.Session()
# the ultimate application is for merging variables which should be in groups,
# e.g. in this example, [1,2,10] is a group of 3, and there are 3 groups of 3
n_groups = 3
vars_per_group = 3 # once the single value from arr1 (below) is included
arr1 = 10+tf.range(n_groups, dtype=float)
arr1 = tf.stack((arr1,arr1,arr1),0)
arr2 = 1+tf.range(n_groups * (vars_per_group-1), dtype=float)
arr2 = tf.stack((arr2,arr2,arr2),0)
catted = tf.concat((arr1,arr2),1) # concatenate the two arrays together
print("arr1 = \n",sess.run(arr1))
print("arr2 = \n",sess.run(arr2))
print("catted = \n",sess.run(catted))
Which gives output
arr1 =
[[10. 11. 12.]
[10. 11. 12.]
[10. 11. 12.]]
arr2 =
[[1. 2. 3. 4. 5. 6.]
[1. 2. 3. 4. 5. 6.]
[1. 2. 3. 4. 5. 6.]]
catted =
[[10. 11. 12. 1. 2. 3. 4. 5. 6.]
[10. 11. 12. 1. 2. 3. 4. 5. 6.]
[10. 11. 12. 1. 2. 3. 4. 5. 6.]]
Now we build the permutation matrix and multiply...
start_index = 2 # location of where the interleaving begins
# cml = "column map list" is the list of where each column will get mapped to
cml = [start_index + x*(vars_per_group) for x in range(n_groups)] # first array
for i in range(n_groups): # second array
    cml += [x + i*(vars_per_group) for x in range(start_index)] # vars before start_index
    cml += [1 + x + i*(vars_per_group) + start_index \
            for x in range(vars_per_group-start_index-1)] # vars after start_index
print("\n cml = ",cml,"\n")
# Create a permutation matrix using cml
np_perm_mat = np.zeros((len(cml), len(cml)))
for idx, i in enumerate(cml):
    np_perm_mat[idx, i] = 1
perm_mat = tf.constant(np_perm_mat,dtype=float)
result = tf.matmul(catted, perm_mat)
print("result = \n",sess.run(result))
Which gives output
cml = [2, 5, 8, 0, 1, 3, 4, 6, 7]
result =
[[ 1. 2. 10. 3. 4. 11. 5. 6. 12.]
[ 1. 2. 10. 3. 4. 11. 5. 6. 12.]
[ 1. 2. 10. 3. 4. 11. 5. 6. 12.]]
Even though this doesn't use scatter_nd as the original question asked, one thing I like about this is, you can allocate the perm_mat once in some __init__() method, and hang on to it, and after that initial overhead it's just matrix-matrix multiplication by a sparse, constant matrix, which should be pretty fast. (?)
Still happy to wait and see what other answers might come in.
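For comparison, the same column reordering can be written as plain indexing with the inverse of cml instead of a matrix product. This is a NumPy sketch with the values of catted and cml copied from the output above. In TensorFlow, tf.gather with axis=1 would presumably play the role of the indexing step, but that is an assumption not tested here.

```python
import numpy as np

catted = np.tile(np.array([10., 11., 12., 1., 2., 3., 4., 5., 6.]), (3, 1))
cml = [2, 5, 8, 0, 1, 3, 4, 6, 7]   # column i of catted moves to column cml[i]

# Permutation-matrix route, as in the answer above.
perm_mat = np.zeros((len(cml), len(cml)))
perm_mat[np.arange(len(cml)), cml] = 1
via_matmul = catted @ perm_mat

# Index route: argsort of a permutation gives its inverse, i.e. for each
# output column, which input column to take.
via_indexing = catted[:, np.argsort(cml)]

print(via_indexing[0])   # first row: 1, 2, 10, 3, 4, 11, 5, 6, 12
print(np.array_equal(via_matmul, via_indexing))  # True
```

The index route does no multiplications at all, so it may be worth timing against the matmul version for wide tensors.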
