I need a transformation of a tensor which is very similar to roll. The difference is that I do not want the values from the end of the axis to appear in the beginning. In other words I want, for example, the 2nd element to be on 3d place but I do not want the last element to become the first one. Instead, I want the first elements to be zeros.
I have tried this:
prev_xs = tf.roll(xs, shift = 1, axis = 1)
prev_xs[:,0] = 0.0
However, it does not work because
TypeError: 'Tensor' object does not support item assignment
So, what is the proper solution of the problem?
You could use
prev_xs = tf.concat((tf.zeros([tf.shape(xs)[0], 1]), xs[:, :1]), axis=1)
Step-by-step, we discard the last column of xs by indexing like [:, :1]. We create a column of zeros with the appropriate number of rows. Then we concatenate it in front of xs, pushing every column back by 1.
Related
I have a numpy array total_weights which is an IxI array of floats. Each row/columns corresponds to one of I items.
During my main loop I acquire another real float array weights of size NxM (N, M < I) where each/column row also corresponds to one of the original I items (duplicates may also exist).
I want to add this array to total_weights. However, the sizes and order of the two arrays are not aligned. Therefore, I maintain a position map, a pandas Series with an index of item IDs to their proper index/position in total_weights, called pos_df.
In order to properly make the addition I want I perform the following operation inside the loop:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
total_weights[candidate_pos, :][:, rated_pos] += weights
Unfortunately, the above operation must be editing a copy of the orignal total_weights matrix and not a view of it, since after the loop the total_weights array is still full of zeroes. How do I make it change the original data?
Edit:
I want to clarify that candidate_IDs are the N IDs of items and rated_IDs are the M IDs of items in the NxM array called weights. Through pos_df I can get their total order in all of I items.
Also, my guess as to the reason a copy is returned is that candidate_IDs and thus candidate_pos will probably contain duplicates e.g. [0, 1, 3, 1, ...]. So the same rows will sometimes have to be pulled into the new array/view.
Your first problem is in how you are using indexing. As candidate_pos is an array, total_weights[candidate_pos, :] is a fancy indexing operation that returns a new array. When you apply indexing again, i.e. ...[:, rated_pos] you are assigning elements to the newly created array rather than to total_weights.
The second problem, as you have already spotted, is in the actual logic you are trying to apply. If I understand your example correctly, you have a I x I matrix with weights, and you want to update weights for a sequence of pairs ((Ix_1, Iy_1), ..., (Ix_N, Iy_N)) with repetitions, with a single line of code. This can't be done in this way, using += operator, as you'll find yourself having added to weights[Ix_n, Iy_n] the weight corresponding to the last time (Ix_n, Iy_n) appears in your sequence: you have to first merge all the repeating elements in your sequence of weight updates, and then perform the update of your weights matrix with the new "unique" sequence of updates. Alternatively, you must collect your weights as an I x I matrix, and directly sum it to total_weights.
After #rveronese pointed out that it's impossible to do it one go because of the duplicates in candidate_pos I believe I have managed to do what I want with a for-loop on them:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
for i, c in enumerate(candidate_pos):
total_weights[c, rated_pos] += weights[i, :]
In this case, the indexing does not create a copy and the assignment should be working as expected...
I know how to access elements in a vector by indices doing:
test = numpy.array([1,2,3,4,5,6])
indices = list([1,3,5])
print(test[indices])
which gives the correct answer : [2 4 6]
But I am trying to do the same thing using a 2D matrix, something like:
currentGrid = numpy.array( [[0, 0.1],
[0.9, 0.9],
[0.1, 0.1]])
indices = list([(0,0),(1,1)])
print(currentGrid[indices])
this should display me "[0.0 0.9]" for the value at (0,0) and the one at (1,1) in the matrix. But instead it displays "[ 0.1 0.1]". Also if I try to use 3 indices with :
indices = list([(0,0),(1,1),(0,2)])
I now get the following error:
Traceback (most recent call last):
File "main.py", line 43, in <module>
print(currentGrid[indices])
IndexError: too many indices for array
I ultimately need to apply a simple max() operation on all the elements at these indices and need the fastest way to do that for optimization purposes.
What am I doing wrong ? How can I access specific elements in a matrix to do some operation on them in a very efficient way (not using list comprehension nor a loop).
The problem is the arrangement of the indices you're passing to the array. If your array is two-dimensional, your indices must be two lists, one containing the vertical indices and the other one the horizontal ones. For instance:
idx_i, idx_j = zip(*[(0, 0), (1, 1), (0, 2)])
print currentGrid[idx_j, idx_i]
# [0.0, 0.9, 0.1]
Note that the first element when indexing arrays is the last dimension, e.g.: (y, x). I assume you defined yours as (x, y) otherwise you'll get an IndexError
There are already some great answers to your problem. Here just a quick and dirty solution for your particular code:
for i in indices:
print(currentGrid[i[0],i[1]])
Edit:
If you do not want to use a for loop you need to do the following:
Assume you have 3 values of your 2D-matrix (with the dimensions x1 and x2 that you want to access. The values have the "coordinates"(indices) V1(x11|x21), V2(x12|x22), V3(x13|x23). Then, for each dimension of your matrix (2 in your case) you need to create a list with the indices for this dimension of your points. In this example, you would create one list with the x1 indices: [x11,x12,x13] and one list with the x2 indices of your points: [x21,x22,x23]. Then you combine these lists and use them as index for the matrix:
indices = [[x11,x12,x13],[x21,x22,x23]]
or how you write it:
indices = list([(x11,x12,x13),(x21,x22,x23)])
Now with the points that you used ((0,0),(1,1),(2,0)) - please note you need to use (2,0) instead of (0,2), because it would be out of range otherwise:
indices = list([(0,1,2),(0,1,0)])
print(currentGrid[indices])
This will give you 0, 0.9, 0.1. And on this list you can then apply the max() command if you like (just to consider your whole question):
maxValue = max(currentGrid[indices])
Edit2:
Here an example how you can transform your original index list to get it into the correct shape:
originalIndices = [(0,0),(1,1),(2,0)]
x1 = []
x2 = []
for i in originalIndices:
x1.append(i[0])
x2.append(i[1])
newIndices = [x1,x2]
print(currentGrid[newIndices])
Edit3:
I don't know if you can apply max(x,0.5) to a numpy array with using a loop. But you could use Pandas instead. You can cast your list into a pandas Series and then apply a lambda function:
import pandas as pd
maxValues = pd.Series(currentGrid[newIndices]).apply(lambda x: max(x,0.5))
This will give you a pandas array containing 0.5,0.9,0.5, which you can simply cast back to a list maxValues = list(maxValues).
Just one note: In the background you will always have some kind of loop running, also with this command. I doubt, that you will get much better performance by this. If you really want to boost performance, then use a for loop, together with numba (you simply need to add a decorator to your function) and execute it in parallel. Or you can use the multiprocessing library and the Pool function, see here. Just to give you some inspiration.
Edit4:
Accidentally I saw this page today, which allows to do exactly what you want with Numpy. The solution (considerin the newIndices vector from my Edit2) to your problem is:
maxfunction = numpy.vectorize(lambda i: max(i,0.5))
print(maxfunction(currentGrid[newIndices]))
2D indices have to be accessed like this:
print(currentGrid[indices[:,0], indices[:,1]])
The row indices and the column indices are to be passed separately as lists.
I'm working on a machine learning project for university and I'm having trouble understanding some bits of code online. Here's the example:
digits = np.loadtxt(raw_data, delimiter=",")
x_train, y_train = digits[:,:-1], digits[:,-1:].squeeze()
What do the slices done in the second line mean? I'm trying to make a slice selecting the first 2/3 of the array and I've done before by something like [:2*array_elements // 3], but I don't understand how to do it if there's a delimiter in half.
numpy (or anything, but this seems like numpy) can implement __getitem__ to accept tuples instead of what stdlib does, where only scalar values are accepted (afaik) (e.g. integers, strings, slice objects).
You want to look at the slice "parts" individually, as specified by , delimiters. So [:,:-1] is actually : and :-1, are are completely independent.
First slice
: is "all", no slicing along that axis.
:x is all up until (and not including) x and -1 means the last element, so...
:-1 is all up until (and not including) the last.
Second slice
x: is all after (and including) x, and we already know about -1 so...
-1: is all after (and including) the last -- in this case just the last.
There are two mechanisms involved here.
The python's notation for slicing array : Understanding Python's slice notation
Basically the syntax is array[x:y] where the resulting slice starts at x (included) and end at y (excluded).
If start (resp. end) is omitted it means "from the first item" (resp. "to the last item) (This is a shortcut).
Also the notation is cyclic :
array[-1:0]
# The elements between the last index - 1 and the first (in this order).
# Which means the elements between the last index -1 and the last index
# Which means a list containing only the last element
array[-1:] = [array[-1]]
The numpy's 2-dimensionnal arrays (assuming the np is for numpy) :
Numpy frequently uses arrays of 2 dimensions like a matrix. So to access the element in row x and column y you can write it matrix[x,y]
Plus the python's notation for slicing arrays also apply here to slice matrix into a sub-matrix of smaller size
So, back at your problem:
digits[:,:-1]
= digits[start:end , start:-1]
= digits[start:end , start:end-1]
= the sub-matrix where you take all the rows (start:end) and you take all the columns except the last one (start:end-1)
And
digit[:,-1:]
= digit[start:end, -1:start]
= digit[start:end, -1:end]
= sub-matrix with all the rows and only the last column
What is the difference in below two lines.
I know [::-1] will reverse the matrix. but I want to know what [::] on LHS side '=' does, as without iterating each element how matrix gets reversed in-place in case of 1st case.
matrix[::] = matrix[::-1]
matrix = matrix[::-1]
The technic you are looking for called slicing. It is an advanced way to reference elements in some container. Instead of using single index you can use a slice to reference a range of elements.
The slice consists of start, end and step, like this matrix[start:end:step]. You can skip some parts and defaults values will be taken - 0, len(matrix), 1.
Of course, a container must support this technic (protocol).
matrix[::] = # get all elements of the matrix and assign something to them
matrix = # link matrix name with something
matrix[::-1] # get all elements of the matrix in reversed order
So, the first one is actually copying elements in different positions of the same object.
The second one is just linking name matrix with new object constructed from slice of matrix.
I have this MATLAB code that I need to translate to python, however there is an issue in creating a new column in the firings array. In MATLAB, the code creates an n*2 matrix that is initially empty and I want to be able to do the same in python. Using NumPy, I created fired = np.where(v >= 30). However python creates a tuple rather than an array so it throws an error:
TypeError: unsupported operand type(s) for +: 'int' and 'tuple'
This is the code I have in MATLAB that I would like converted into Python
firings=[];
firings=[firings; t+0*fired, fired];
Help is appreciated! Thanks!
np.where generates a two-element tuple if the array is 1D in nature. For the 1D case, you would need to access the first element of the result of np.where only:
fired = np.where(v >= 30)[0]
You can then go ahead and concatenate the matrices. Also a suggestion provided by user #Divakar would be to use np.flatnonzero which would equivalently find the non-zero values in a NumPy array and flattened into a 1D array for less headaches:
fired = np.flatnonzero(v >= 30)
Take note that the logic to concatenate would not work if there were no matches found in fired. You will need to take this into account when you look at your concatenating logic. The convenient thing with MATLAB is that you're able to concatenate empty matrices and the result is no effect (obviously).
Also note that there is no conception of a row vector or column vector in NumPy. It is simply a 1D array. If you want to specifically force the array to be a column vector as you have it, you need to introduce a singleton axis in the second dimension for you to do this. Note that this only works provided that np.where gave you matched results. After, you can use np.vstack and np.hstack to vertically and horizontally concatenate arrays to help you do what you ask. What you have to do first is create a blank 2D array, then do what we just covered:
firings = np.array([[]]) # Create blank 2D array
# Some code here...
# ...
# ...
# fired = find(v >= 30); % From MATLAB
fired = np.where(v >= 30)[0]
# or you can use...
# fired = np.flatnonzero(v >= 30)
if np.size(fired) != 0:
fired = fired[:, None] # Introduce singleton axis
# Update firings with two column vectors
# firings = [firings; t + 0 * fired, fired]; % From MATLAB
firings = np.vstack([firings, np.hstack([t + 0*fired, fired])])
Here np.size finds the total number of elements in the NumPy array. If the result of np.where generated no results, the number of elements in fired should be 0. Therefore the if statement only executes if we have found at least one element in v subject to v >= 30.
If you use numpy, you can define an ndarray:
import numpy as np
firings=np.ndarray(shape=(1,2)
firings[0][0:]=(1.,2.)
firings=np.append(firings,[[3.,4.]],axis=0)