I have a very quick question about the notation for accessing an array in Python.
This is the line:
trainPredictPlot[look_back:len(trainPredict) + look_back, :] = trainPredict
I've seen arrays being accessed like this x[a:b] but never like this x[a:b,:]
Can someone explain in detail what this line of code is doing? What does it mean to put a colon before the closing bracket? What about the comma?
When you use x[a:b], it means that you are taking the elements from position "a" (x[a]) up to, but not including, position "b" of a one dimensional array, so the last element taken is x[b-1].
The second case, x[a:b,:], applies to a two dimensional array: you take positions "a" up to (but not including) "b" of the first dimension, and all the elements of the second dimension. In other words, from x[a][first element] to x[b-1][last element].
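A small sketch of both forms may help. The array contents and the names plot, pred and look_back below are made up for illustration; only the indexing pattern comes from the question.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)   # 4 rows, 3 columns
a, b = 1, 3

rows = x[a:b]       # rows a .. b-1 (row b is excluded)
same = x[a:b, :]    # the same rows, with "all columns" spelled out

# Assigning through the slice writes into the original array,
# which is what the line in the question does.
plot = np.full((6, 1), np.nan)
pred = np.array([[10.0], [20.0], [30.0]])
look_back = 2
plot[look_back:len(pred) + look_back, :] = pred   # fills rows 2, 3 and 4
```

Rows 0, 1 and 5 of plot stay NaN, since the slice only covers rows look_back up to len(pred) + look_back - 1.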
Related
I have a numpy array total_weights which is an IxI array of floats. Each row/columns corresponds to one of I items.
During my main loop I acquire another real float array weights of size NxM (N, M < I) where each row/column also corresponds to one of the original I items (duplicates may also exist).
I want to add this array to total_weights. However, the sizes and order of the two arrays are not aligned. Therefore, I maintain a position map, a pandas Series with an index of item IDs to their proper index/position in total_weights, called pos_df.
In order to properly make the addition I want I perform the following operation inside the loop:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
total_weights[candidate_pos, :][:, rated_pos] += weights
Unfortunately, the above operation must be editing a copy of the original total_weights matrix and not a view of it, since after the loop the total_weights array is still full of zeroes. How do I make it change the original data?
Edit:
I want to clarify that candidate_IDs are the N IDs of items and rated_IDs are the M IDs of items in the NxM array called weights. Through pos_df I can get their total order in all of I items.
Also, my guess as to the reason a copy is returned is that candidate_IDs and thus candidate_pos will probably contain duplicates e.g. [0, 1, 3, 1, ...]. So the same rows will sometimes have to be pulled into the new array/view.
Your first problem is in how you are using indexing. As candidate_pos is an array, total_weights[candidate_pos, :] is a fancy indexing operation that returns a new array. When you apply indexing again, i.e. ...[:, rated_pos] you are assigning elements to the newly created array rather than to total_weights.
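The copy behaviour described above can be reproduced on a toy array. This is only a sketch: the names total, rows and cols are made up, and the indices are chosen without duplicates so that only the copy problem is visible.

```python
import numpy as np

total = np.zeros((4, 4))
rows = np.array([0, 2])
cols = np.array([1, 3])

# Chained indexing: the first [...] builds a new array,
# so the += lands on that temporary and is lost.
total[rows, :][:, cols] += 1.0
lost = total.sum()    # total is still all zeros

# A single indexing expression is written back to total itself.
total[np.ix_(rows, cols)] += 1.0
kept = total.sum()    # the four selected cells were incremented
```

np.ix_ builds the index arrays that select every (row, column) combination, which is the selection the chained version was aiming for.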
The second problem, as you have already spotted, is in the actual logic you are trying to apply. If I understand your example correctly, you have an I x I matrix of weights, and you want to update the weights for a sequence of pairs ((Ix_1, Iy_1), ..., (Ix_N, Iy_N)), with repetitions, in a single line of code. This can't be done with the += operator on a fancy index: for a pair that appears several times, you'll find that only the weight from the last time (Ix_n, Iy_n) appears in your sequence has been added to total_weights[Ix_n, Iy_n]. You have to first merge all the repeating elements in your sequence of weight updates, and then update your weights matrix with the new "unique" sequence of updates. Alternatively, you can collect your weights as an I x I matrix and add it directly to total_weights.
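One way to get the accumulating behaviour without merging duplicates by hand is NumPy's unbuffered np.add.at. This is a suggestion that goes beyond the answer above, shown on made-up data where row 1 is deliberately repeated.

```python
import numpy as np

total = np.zeros((3, 3))
rows = np.array([0, 1, 1])    # row 1 appears twice
cols = np.array([0, 2])
w = np.ones((3, 2))           # one row of updates per entry in rows

# Buffered +=: the repeated row is only updated once.
buffered = total.copy()
buffered[np.ix_(rows, cols)] += w

# Unbuffered add: every occurrence of a repeated index is accumulated.
np.add.at(total, np.ix_(rows, cols), w)
```

Here buffered[1, 0] ends up as 1.0 while total[1, 0] ends up as 2.0, which is the difference between the two approaches.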
After @rveronese pointed out that it's impossible to do it in one go because of the duplicates in candidate_pos, I believe I have managed to do what I want with a for-loop over them:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
for i, c in enumerate(candidate_pos):
    total_weights[c, rated_pos] += weights[i, :]
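In this case, the single indexing expression is written back to total_weights rather than to a temporary copy, so the assignment should work as expected. Note that duplicates inside rated_pos would still be counted only once per row.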
I have a 3D array with the shape (9, 100, 7200). I want to remove the 2nd half of the 7200 values in every row so the new shape will be (9, 100, 3600).
What can I do to slice the array or delete the 2nd half of the indices? I was thinking np.delete(arr, [3601:7200], axis=2), but I get an invalid syntax error when using the colon.
Why not just slicing?
arr = arr[:,:,:3600]
The syntax error occurs because [3601:7200] is not valid python. I assume you are trying to create a new array of numbers to pass as the obj parameter for the delete function. You could do it this way using something like the range function:
np.delete(arr, range(3600,7200), axis=2)
Keep in mind that this will not modify arr, but it will return a new array with the elements deleted. Also, notice I have used 3600, not 3601, because indexing starts at 0.
However, it's often better practice to use slicing in a problem like this:
arr[:,:,:3600]
This gives your required shape. Let me break this down a little. We are slicing a numpy array with 3 dimensions. Just putting a colon in means we are taking everything in that dimension. :3600 means we are taking the first 3600 elements in that dimension. A better way to think about deleting the last half is to think of it as keeping the first half.
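The two approaches can be compared directly. This sketch uses a zero-filled uint8 array of the shape from the question, and np.s_ as a way to hand a slice to np.delete; both choices are illustrative.

```python
import numpy as np

arr = np.zeros((9, 100, 7200), dtype=np.uint8)
half = arr.shape[2] // 2      # 3600

# Slicing keeps the first half and returns a view on the same memory.
sliced = arr[:, :, :half]

# np.delete removes the second half and returns a new array (a copy).
deleted = np.delete(arr, np.s_[half:], axis=2)
```

Both results have shape (9, 100, 3600). The slice is cheaper because no data is copied, but it also means that writing to sliced changes arr.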
I need a transformation of a tensor which is very similar to roll. The difference is that I do not want the values from the end of the axis to appear at the beginning. In other words, I want, for example, the 2nd element to be in 3rd place, but I do not want the last element to become the first one. Instead, I want the first elements to be zeros.
I have tried this:
prev_xs = tf.roll(xs, shift = 1, axis = 1)
prev_xs[:,0] = 0.0
However, it does not work because
TypeError: 'Tensor' object does not support item assignment
So, what is the proper solution of the problem?
You could use
prev_xs = tf.concat((tf.zeros([tf.shape(xs)[0], 1]), xs[:, :-1]), axis=1)
Step-by-step, we discard the last column of xs by indexing like [:, :-1]. We create a column of zeros with the appropriate number of rows. Then we concatenate it in front of the truncated xs, pushing every column back by 1. Note that tf.zeros defaults to float32, so pass dtype=xs.dtype if xs has a different type.
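The shift-and-zero-fill idea can be checked on a small example. The sketch below uses NumPy rather than TensorFlow, purely so that it runs without a TensorFlow installation; np.concatenate and np.zeros stand in for tf.concat and tf.zeros, and the values in xs are made up.

```python
import numpy as np

xs = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]])

# One column of zeros, with as many rows as xs.
zeros = np.zeros((xs.shape[0], 1), dtype=xs.dtype)

# Drop the last column, then put the zero column in front.
prev_xs = np.concatenate((zeros, xs[:, :-1]), axis=1)
```

Each row is shifted right by one position, the last value falls off, and the first position becomes 0.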
I'm working on a machine learning project for university and I'm having trouble understanding some bits of code online. Here's the example:
digits = np.loadtxt(raw_data, delimiter=",")
x_train, y_train = digits[:,:-1], digits[:,-1:].squeeze()
What do the slices in the second line mean? I'm trying to make a slice selecting the first 2/3 of the array, and I've done that before with something like [:2*array_elements // 3], but I don't understand how to do it when there's a comma in the middle.
numpy (or anything, but this seems like numpy) can implement __getitem__ to accept tuples instead of what stdlib does, where only scalar values are accepted (afaik) (e.g. integers, strings, slice objects).
You want to look at the slice "parts" individually, as separated by the , delimiters. So [:,:-1] is actually : and :-1, which are completely independent.
First slice
: is "all", no slicing along that axis.
:x is all up until (and not including) x and -1 means the last element, so...
:-1 is all up until (and not including) the last.
Second slice
x: is all after (and including) x, and we already know about -1 so...
-1: is all after (and including) the last -- in this case just the last.
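Applied to a small stand-in for digits, the two slices look like this. The array contents are made up (3 samples with 3 features and 1 label each); the explicit tuple of slice objects is shown only to illustrate what the comma notation passes to __getitem__.

```python
import numpy as np

digits = np.arange(12.0).reshape(3, 4)   # last column plays the label role

x_train = digits[:, :-1]             # all rows, every column except the last
y_train = digits[:, -1:].squeeze()   # all rows, last column, flattened to 1-D

# The comma-separated form is a tuple of independent slice objects.
same = digits[(slice(None), slice(None, -1))]
```

digits[:, -1:] has shape (3, 1) because a slice keeps the axis; squeeze() then drops that length-1 axis to give shape (3,).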
There are two mechanisms involved here.
The python's notation for slicing array : Understanding Python's slice notation
Basically the syntax is array[x:y] where the resulting slice starts at x (included) and end at y (excluded).
If start (resp. end) is omitted it means "from the first item" (resp. "to the last item"). This is a shortcut.
Negative indices count from the end of the array, so -1 refers to the last element:
array[-1:]
# The elements from the last index up to the end of the array,
# which means a list containing only the last element:
array[-1:] == [array[-1]]
# Note that array[-1:0] is empty, because the end index 0 comes before the start.
NumPy's 2-dimensional arrays (assuming np stands for numpy):
NumPy frequently uses arrays of 2 dimensions, like a matrix. So to access the element in row x and column y you can write matrix[x,y]
Python's slice notation also applies here, to cut a matrix into a sub-matrix of smaller size.
So, back at your problem:
digits[:,:-1]
= digits[start:end , start:-1]
= digits[start:end , start:end-1]
= the sub-matrix where you take all the rows (start:end) and you take all the columns except the last one (start:end-1)
And
digits[:,-1:]
= digits[start:end , -1:end]
= digits[start:end , end-1:end]
= the sub-matrix with all the rows and only the last column
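The negative-index behaviour and the "first 2/3" split from the question can be sketched together. The names items, data, n_train, train and rest, and all the values, are made up for illustration.

```python
import numpy as np

items = [10, 20, 30, 40]
last_only = items[-1:]    # from the last index to the end
empty = items[-1:0]       # end index 0 comes before the start

# Taking the first two thirds of the rows of a 2-D array:
# slice the first axis and keep every column.
data = np.arange(24).reshape(6, 4)
n_train = 2 * len(data) // 3    # len() of a 2-D array is its number of rows
train = data[:n_train, :]
rest = data[n_train:, :]
```

The comma does not change how the row slice is computed; it only adds a second, independent slice for the columns.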
What is the difference between the two lines below?
I know [::-1] will reverse the matrix, but I want to know what [::] on the left-hand side of '=' does. How does the matrix get reversed in place in the first case, without iterating over each element?
matrix[::] = matrix[::-1]
matrix = matrix[::-1]
The technique you are looking for is called slicing. It is an advanced way to reference elements in some container. Instead of using a single index you can use a slice to reference a range of elements.
The slice consists of start, end and step, like this: matrix[start:end:step]. You can skip some parts and default values will be used: 0, len(matrix) and 1 for a positive step. With a negative step the defaults are swapped, so the slice runs from the last element back to the first.
Of course, a container must support this technique (protocol).
matrix[::] = # get all elements of the matrix and assign something to them
matrix = # link matrix name with something
matrix[::-1] # get all elements of the matrix in reversed order
So, the first one is actually copying elements into different positions of the same object.
The second one is just linking the name matrix with a new object constructed from a slice of matrix.
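The difference becomes visible when a second name refers to the same list. A minimal sketch, with made-up values and a hypothetical alias variable:

```python
matrix = [[1, 2], [3, 4], [5, 6]]
alias = matrix                   # second name for the same list object

# Slice assignment: the existing list is modified in place,
# so alias sees the reversed contents too.
matrix[::] = matrix[::-1]
after_in_place = list(alias)     # snapshot of what alias sees now
still_same_object = alias is matrix

# Plain assignment: matrix is rebound to a brand new list,
# while alias keeps pointing at the old one.
matrix = matrix[::-1]
rebound = alias is not matrix
```

After the last line, matrix is back in its original order as a new list, and alias still holds the reversed list from the in-place step.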