I am getting an error in the following code:
x = cntk.input_variable(shape=(8,3,1))
y = cntk.sequence.slice(x,1,0)
x0 = np.reshape(np.arange(48.0,dtype=np.float32),(2,8,1,3))
y.eval({x:x0})
Error: Sequences must be at least one frame long
But when I run this, it runs fine:
x = cntk.input_variable(shape=(3,2)) #change
y = cntk.sequence.slice(x,1,0)
x0 = np.reshape(np.arange(24.0,dtype=np.float32),(1,8,1,3)) #change
y.eval({x:x0})
I am not able to understand a few things about the slice method:
At what array level is it going to slice?
According to the documentation, the second argument is begin_index, and the third is end_index. How can begin_index be greater than end_index?
There are two versions of slice(), one for slicing tensors, and one for slicing sequences. Your example uses the one for sequences.
If your inputs are sequences (e.g. words), the first form, cntk.slice(), would individually slice every element of a sequence and create a sequence of the same length that consists of sliced tensors. The second form, cntk.sequence.slice(), will slice out a range of entries from the sequence. E.g. cntk.sequence.slice(x, 13, 42) will cut out sequence items 13..41 from x, and create a new sequence of length (42-13).
If you intended to experiment with the first form, please change to cntk.slice(). If you meant the sequence version, please try to enclose x0 in an additional [...]. The canonical form of passing minibatch data is as a list of batch entries (e.g. MB size of 128 --> a list with 128 entries), where each batch entry is a tensor of shape (Ti,) + input_shape where Ti is the sequence length of the respective sequence. This
x0 = [ np.reshape(np.arange(48.0,dtype=np.float32),(2,8,1,3)) ]
would denote a minibatch with a single entry (1 list entry), where the entry is a sequence of 2 sequence items, where each sequence item has shape (8,1,3).
The begin and end indices can be negative, in order to index from the end (similar to Python slices). Unlike Python however, 0 is a valid end index that refers to the end.
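As a quick illustration of those index semantics, here is a numpy sketch (not CNTK itself; seq_slice is a hypothetical helper that mimics how cntk.sequence.slice interprets its indices along the sequence axis):

```python
import numpy as np

def seq_slice(seq, begin_index, end_index):
    """Hypothetical helper mimicking cntk.sequence.slice index semantics on a
    plain numpy 'sequence' (axis 0 = sequence items): negative indices count
    from the end, and end_index=0 refers to the end of the sequence."""
    n = seq.shape[0]
    b = begin_index if begin_index >= 0 else n + begin_index
    e = n if end_index == 0 else (end_index if end_index >= 0 else n + end_index)
    return seq[b:e]

seq = np.arange(10.0).reshape(5, 2)   # a sequence of 5 items, each of shape (2,)
print(seq_slice(seq, 1, 3).shape)     # (2, 2): items 1..2
print(seq_slice(seq, -2, 0).shape)    # (2, 2): the last two items
```

Under these semantics, the original slice(x, 1, 0) drops the first sequence item and keeps everything from item 1 to the end.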
I have a numpy array total_weights which is an I x I array of floats. Each row/column corresponds to one of I items.
During my main loop I acquire another real float array weights of size N x M (N, M < I), where each row/column also corresponds to one of the original I items (duplicates may also exist).
I want to add this array to total_weights. However, the sizes and order of the two arrays are not aligned. Therefore, I maintain a position map, a pandas Series called pos_df whose index is item IDs and whose values are their proper index/position in total_weights.
In order to properly make the addition I want I perform the following operation inside the loop:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
total_weights[candidate_pos, :][:, rated_pos] += weights
Unfortunately, the above operation must be editing a copy of the original total_weights matrix and not a view of it, since after the loop the total_weights array is still full of zeroes. How do I make it change the original data?
Edit:
I want to clarify that candidate_IDs are the N IDs of items and rated_IDs are the M IDs of items in the NxM array called weights. Through pos_df I can get their total order in all of I items.
Also, my guess as to the reason a copy is returned is that candidate_IDs and thus candidate_pos will probably contain duplicates e.g. [0, 1, 3, 1, ...]. So the same rows will sometimes have to be pulled into the new array/view.
Your first problem is in how you are using indexing. As candidate_pos is an array, total_weights[candidate_pos, :] is a fancy indexing operation that returns a new array. When you apply indexing again, i.e. ...[:, rated_pos] you are assigning elements to the newly created array rather than to total_weights.
The second problem, as you have already spotted, is in the actual logic you are trying to apply. If I understand your example correctly, you have an I x I matrix of weights, and you want to update the weights for a sequence of pairs ((Ix_1, Iy_1), ..., (Ix_N, Iy_N)), with repetitions, in a single line of code. This can't be done with the += operator, because for each repeated pair (Ix_n, Iy_n) only the weight from its last occurrence in the sequence ends up added to weights[Ix_n, Iy_n]: you have to first merge all the repeating elements in your sequence of weight updates, and then perform the update of your weights matrix with the new "unique" sequence of updates. Alternatively, you can first accumulate your updates into an I x I matrix and directly sum that to total_weights.
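A tiny demonstration of that buffering behaviour, together with np.add.at, numpy's unbuffered variant of += that accumulates over duplicate indices:

```python
import numpy as np

total = np.zeros(4)
idx = np.array([0, 1, 1, 3])      # note the duplicated index 1
w = np.array([1.0, 2.0, 3.0, 4.0])

total[idx] += w                   # buffered: the two updates to index 1 collide
print(total)                      # [1. 3. 0. 4.] -- only the last write to 1 survives

total = np.zeros(4)
np.add.at(total, idx, w)          # unbuffered: duplicates accumulate
print(total)                      # [1. 5. 0. 4.]
```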
After @rveronese pointed out that it's impossible to do it in one go because of the duplicates in candidate_pos, I believe I have managed to do what I want with a for-loop over them:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
for i, c in enumerate(candidate_pos):
total_weights[c, rated_pos] += weights[i, :]
In this case, the indexing does not create a copy and the assignment works as expected...
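For what it's worth, the loop can also be collapsed into a single unbuffered update with np.add.at and np.ix_ (a sketch with made-up sizes; np.ix_ builds the full N x M grid of (row, column) pairs, and np.add.at accumulates correctly even when candidate_pos contains duplicates):

```python
import numpy as np

total_weights = np.zeros((5, 5))
candidate_pos = np.array([0, 1, 1])   # duplicated row position on purpose
rated_pos = np.array([2, 4])
weights = np.ones((3, 2))

# np.ix_ addresses the full N x M block of (row, col) pairs;
# np.add.at is unbuffered, so repeated rows accumulate instead of overwriting.
np.add.at(total_weights, np.ix_(candidate_pos, rated_pos), weights)
print(total_weights[1, 2])            # 2.0 -- row position 1 was updated twice
```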
a=np.random.dirichlet(np.ones(3),size=1)
I want three numbers that sum up to 1. However, I noticed that a[0] will be:
array([0.24414272, 0.01769199, 0.7381653 ])
a single entry that already contains all three elements.
Is there any way to split them into three separate values?
The default behavior if you don't pass size is to return a single dimensional array with the specified elements, per the docstring on the function:
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n), then m * n * k [where k is size of input and sample sequences] samples are drawn. Default is None, in which case a vector of length k is returned.
By passing size=1, you explicitly tell it to make a multidimensional array with size samples (here, 1 sample, making the outer dimension 1), while not passing size (or passing size=None) still draws just one set of samples, but as a single 1D array.
Short version: If you just drop the ,size=1 from your call, you'll get what you want.
If that's the only thing you want, then this should work:
a=np.random.dirichlet(np.ones(3),size=1)[0]
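In other words, without size the draw is already a flat length-3 vector, which you can unpack directly into three variables (a minimal sketch using a seeded generator so the draw is reproducible):

```python
import numpy as np

rng = np.random.default_rng(42)
a, b, c = rng.dirichlet(np.ones(3))   # a 1-D array of 3 floats -> three scalars
print(a + b + c)                      # sums to 1 (up to floating-point error)
```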
I am incrementally sampling a batch of size torch.Size([n, 8]).
I also have a list valid_indices of length n which contains tuples of indices that are valid for each entry in the batch.
For instance valid_indices[0] may look like this: (0,1,3,4,5,7), which suggests that indices 2 and 6 should be excluded from the first entry in batch along dim 1.
Particularly I need to exclude these values for when I use torch.max(batch, dim=1, keepdim=True).
Indices to be excluded (if any) may differ from entry to entry within the batch.
Any ideas? Thanks in advance.
I assume that you are getting the good old
IndexError: too many indices for tensor of dimension 1
error when you use your tuple indices directly on the tensor.
At least that was the error that I was able to reproduce when I execute the following line
t[0][valid_idx0]
Where t is a random tensor with size (10,8) and valid_idx0 is a tuple with 4 elements.
However, the same line works just fine when you convert your tuple to a list, as follows:
t[0][list(valid_idx0)]
>>> tensor([0.1847, 0.1028, 0.7130, 0.5093])
But when it comes to applying these indices to 2D tensors, things get a bit different, since we need to preserve the structure of our tensor for batch processing.
Therefore, it would be reasonable to convert our indices to mask arrays.
Let's say we have a list of tuples valid_indices at hand. First thing will be converting it to a list of lists.
valid_idx_list = [list(tup) for tup in valid_indices]
Second thing will be converting them to mask arrays.
masks = torch.zeros_like(t)
for i, indices in enumerate(valid_idx_list):
    masks[i][indices] = 1
Done. Now we can apply our mask and use torch.max on the masked tensor. (Note that zeroing out the excluded entries only makes sense if the tensor values are non-negative; otherwise fill them with -inf instead.)
torch.max(t*masks, dim=1, keepdim=True)
Kindly see the colab notebook that I've used to reproduce the problem.
https://colab.research.google.com/drive/1BhKKgxk3gRwUjM8ilmiqgFvo0sfXMGiK?usp=sharing
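A self-contained variant of the same idea that also copes with negative values, by filling excluded positions with -inf instead of multiplying by zero (the sizes and valid_indices below are made up for illustration):

```python
import torch

n = 4
torch.manual_seed(0)
batch = torch.randn(n, 8)
valid_indices = [(0, 1, 3, 4, 5, 7), (0, 2), (1, 3, 5), tuple(range(8))]

# Build a boolean mask and set excluded entries to -inf, so torch.max
# ignores them even when the tensor contains negative values.
mask = torch.zeros(n, 8, dtype=torch.bool)
for i, idx in enumerate(valid_indices):
    mask[i, list(idx)] = True

masked = batch.masked_fill(~mask, float('-inf'))
values, positions = torch.max(masked, dim=1, keepdim=True)
print(values.shape)   # torch.Size([4, 1])
```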
I have a sequences collection in the following form:
sequences = torch.tensor([[2,1],[5,6],[3,0]])
indexes = torch.tensor([1,0,1])
that is, sequence 0 is made of just [5,6], and sequence 1 is made of [2,1], [3,0]. Mathematically sequence[i] = { sequences[j] such that i = indexes[j] }
I need to feed these sequences into an LSTM. Since these are variable-length sequences, pytorch documentation states to use something like torch.nn.utils.rnn.pack_sequence.
Sadly, this method and its like want, as input, a list of tensors where each of them is a L x *, with L being the length of the single sequence.
How can I build something that can be fed into a PyTorch LSTM?
P.s. throughout the code I work with these tensors using scatter and gather functionalities but I can't find a way to use them to achieve this goal.
First of all, you need to separate your sequences. pack_sequence accepts a list of tensors, each tensor of shape L x *. The other dimensions must always be the same for all sequences, but L, the sequence length, can vary. For example, your sequences 0 and 1 can be packed as:
sequences = [torch.tensor([[5,6]]), torch.tensor([[2,1],[3,0]])]
packed_seq = torch.nn.utils.rnn.pack_sequence(sequences, enforce_sorted=False)
Here, in sequences, sequences[0] is of shape (1,2) while sequences[1] is of shape (2,2). The first dimension represents their length, which is 1 and 2 respectively.
You can separate the sequences by:
sequences = torch.tensor([[2,1],[5,6],[3,0]])
indexes = torch.tensor([1,0,1])
num_seq = np.unique(indexes)
sequences = [sequences[indexes==seq_id] for seq_id in num_seq]
This creates sequences=[torch.tensor([[5,6]]), torch.tensor([[2,1],[3,0]])].
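Putting it together, here is a minimal sketch (with a made-up hidden size) that separates the sequences, packs them, and runs them through an nn.LSTM:

```python
import torch

sequences = torch.tensor([[2., 1.], [5., 6.], [3., 0.]])
indexes = torch.tensor([1, 0, 1])

# separate the rows into per-sequence tensors of shape (L_i, 2)
seq_list = [sequences[indexes == i] for i in torch.unique(indexes)]

# pack the variable-length sequences and feed the pack to an LSTM
packed = torch.nn.utils.rnn.pack_sequence(seq_list, enforce_sorted=False)
lstm = torch.nn.LSTM(input_size=2, hidden_size=4, batch_first=True)
packed_out, (h, c) = lstm(packed)

# unpack back to a padded (batch, max_len, hidden) tensor plus lengths
out, lengths = torch.nn.utils.rnn.pad_packed_sequence(packed_out, batch_first=True)
print(out.shape)   # torch.Size([2, 2, 4])
print(lengths)     # tensor([1, 2])
```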
I found an alternative and more efficient way to separate the sequences:
sequences = torch.tensor([[2,1],[5,6],[3,0]])
indexes = torch.tensor([1,0,1])
sorted_seq = sequences[indexes.argsort()]
indexes_count = torch.unique(indexes, return_counts=True)[1]
splitted = torch.split(sorted_seq, indexes_count.tolist(), dim=0)
This method is almost 3 times faster than the one proposed by @Mercury.
Measured using the timeit module, with sequences being a (5000,256) tensor and indexes being (1500).
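A rough way to reproduce that comparison (with smaller made-up sizes than the ones quoted, to keep it quick; indexes here holds one group id per row, and the exact ratio will vary by machine):

```python
import timeit
import torch

torch.manual_seed(0)
sequences = torch.randn(2000, 64)
indexes = torch.randint(0, 500, (2000,))   # one group id per row

def by_mask():
    # boolean-mask approach: one pass over the tensor per unique id
    return [sequences[indexes == i] for i in torch.unique(indexes)]

def by_split():
    # sort-then-split approach: one argsort plus a single torch.split
    s = sequences[indexes.argsort()]
    counts = torch.unique(indexes, return_counts=True)[1]
    return torch.split(s, counts.tolist(), dim=0)

print(timeit.timeit(by_mask, number=3))
print(timeit.timeit(by_split, number=3))
```

Note that torch.argsort is not guaranteed to be stable by default, so the row order within a group may differ between the two methods even though the group sizes match.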
I'm working on a machine learning project for university and I'm having trouble understanding some bits of code online. Here's the example:
digits = np.loadtxt(raw_data, delimiter=",")
x_train, y_train = digits[:,:-1], digits[:,-1:].squeeze()
What do the slices done in the second line mean? I'm trying to make a slice selecting the first 2/3 of the array, and I've done it before with something like [:2*array_elements // 3], but I don't understand how to do it when there's a comma in the middle.
numpy (or anything, but this seems like numpy) can implement __getitem__ to accept tuples, unlike the stdlib, where (afaik) only scalar values are accepted (e.g. integers, strings, slice objects).
You want to look at the slice "parts" individually, as separated by the , delimiter. So [:,:-1] is actually : and :-1, which are completely independent.
First slice
: is "all", no slicing along that axis.
:x is all up until (and not including) x and -1 means the last element, so...
:-1 is all up until (and not including) the last.
Second slice
x: is all after (and including) x, and we already know about -1 so...
-1: is all after (and including) the last -- in this case just the last.
There are two mechanisms involved here.
Python's notation for slicing arrays: Understanding Python's slice notation
Basically the syntax is array[x:y], where the resulting slice starts at x (included) and ends at y (excluded).
If start (resp. end) is omitted it means "from the first item" (resp. "to the last item") (this is a shortcut).
Also, negative indices count from the end of the array:
array[-1:]
# all elements from the last index to the end,
# which means a list containing only the last element
array[-1:] == [array[-1]]
NumPy's 2-dimensional arrays (assuming np is numpy):
NumPy frequently uses 2-dimensional arrays, like a matrix. To access the element in row x and column y you can write matrix[x,y].
Python's notation for slicing arrays also applies here, to slice the matrix into a sub-matrix of smaller size.
So, back at your problem:
digits[:,:-1]
= digits[start:end , start:-1]
= digits[start:end , start:end-1]
= the sub-matrix where you take all the rows (start:end) and you take all the columns except the last one (start:end-1)
And
digits[:,-1:]
= digits[start:end, -1:end]
= the sub-matrix with all the rows and only the last column
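Putting the two answers together, here is a minimal sketch with a tiny made-up digits array, including the "first 2/3 of the rows" split the question asks about:

```python
import numpy as np

digits = np.arange(12.0).reshape(4, 3)      # 4 rows, 3 columns
x_train, y_train = digits[:, :-1], digits[:, -1:].squeeze()
print(x_train.shape, y_train.shape)         # (4, 2) (4,)

# the "first 2/3 of the rows" split: slice the row axis, leave columns alone
n = digits.shape[0]
first, rest = digits[:2 * n // 3], digits[2 * n // 3:]
print(first.shape, rest.shape)              # (2, 3) (2, 3)
```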