Python Code not clear (arrays)


I have the following lines:
Xtest = numpy.arange(-15,15,0.1)
Xtest = numpy.array([Xtest,Xtest*0+1]).T
Why is the second line written with "Xtest*0+1"? I've tried
Xtest = numpy.array([Xtest,1]).T
I get the same output except that at the end of the array I have "dtype=object". Why is that?
Also, not clear what happens when I try
Xtest = numpy.array([Xtest,Xtest*0]).T
The output is unclear to me. Thought that I would have Xtest column with the column of 0's...
Finally,
Xtest = numpy.array([Xtest,0]).T
Why am I getting the second column with ones instead of zeros?

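Since Xtest is an array, it has more than one entry. Multiplying it by zero gives an array of that many zeros, and adding one turns it into an array of ones with the same length as Xtest. In contrast, when you put in 1 directly, you end up with a single scalar 1, which is different: its length does not match Xtest, so NumPy cannot build a regular two-row numeric array and falls back to dtype=object.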
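A minimal sketch of the difference, assuming NumPy is imported as np. The names ones, design and alt are made up for illustration, and np.ones_like with np.column_stack is shown only as an equivalent spelling, not as what the original code used. Recent NumPy releases raise a ValueError for the ragged numpy.array([Xtest, 1]) form instead of returning an object array, so that line is left out.

```python
import numpy as np

Xtest = np.arange(-15, 15, 0.1)

# Xtest * 0 is an array of zeros with the same length as Xtest;
# adding 1 turns it into an array of ones of that length.
ones = Xtest * 0 + 1

# Two rows of equal length stack into a regular 2-D float array;
# .T turns the rows into columns.
design = np.array([Xtest, ones]).T

# Equivalent, more explicit spelling of the same construction.
alt = np.column_stack([Xtest, np.ones_like(Xtest)])

print(design.shape, design.dtype)
```

Both forms give a two-column array whose first column is Xtest and whose second column is all ones.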

Related

Changing the values of sliced numpy array doesn't change the original data in it

I have a numpy array total_weights which is an IxI array of floats. Each row/columns corresponds to one of I items.
During my main loop I acquire another float array, weights, of size NxM (N, M < I), where each row/column also corresponds to one of the original I items (duplicates may also exist).
I want to add this array to total_weights. However, the sizes and order of the two arrays are not aligned. Therefore, I maintain a position map, a pandas Series with an index of item IDs to their proper index/position in total_weights, called pos_df.
In order to properly make the addition I want I perform the following operation inside the loop:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
total_weights[candidate_pos, :][:, rated_pos] += weights
Unfortunately, the above operation must be editing a copy of the original total_weights matrix and not a view of it, since after the loop the total_weights array is still full of zeroes. How do I make it change the original data?
Edit:
I want to clarify that candidate_IDs are the N IDs of items and rated_IDs are the M IDs of items in the NxM array called weights. Through pos_df I can get their total order in all of I items.
Also, my guess as to the reason a copy is returned is that candidate_IDs and thus candidate_pos will probably contain duplicates e.g. [0, 1, 3, 1, ...]. So the same rows will sometimes have to be pulled into the new array/view.

Your first problem is in how you are using indexing. As candidate_pos is an array, total_weights[candidate_pos, :] is a fancy indexing operation that returns a new array. When you index again with ...[:, rated_pos], you are assigning elements to that newly created array rather than to total_weights.
The second problem, as you have already spotted, is in the logic itself. If I understand your example correctly, you have an I x I matrix of weights, and you want to update it for a sequence of pairs ((Ix_1, Iy_1), ..., (Ix_N, Iy_N)) that may contain repetitions, in a single line of code. This can't be done with the += operator: for a repeated pair, total_weights[Ix_n, Iy_n] ends up increased only by the weight from the last time (Ix_n, Iy_n) appears in your sequence. You have to merge the repeated elements in your sequence of updates first, and then update the matrix with the resulting sequence of unique pairs. Alternatively, collect your weights into an I x I matrix and add it directly to total_weights.
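Both problems can be reproduced on a small example. The sketch below uses plain integer arrays in place of the pandas lookups from the question, and all values are made up. It also shows np.add.at, NumPy's unbuffered in-place add, as one possible way to accumulate repeated positions; that function is not part of the answer above and is offered only as an alternative to merging the duplicates by hand.

```python
import numpy as np

candidate_pos = np.array([0, 1, 1])   # row positions, with a duplicate (assumed values)
rated_pos = np.array([2, 3])          # column positions (assumed values)
weights = np.ones((3, 2))

# 1) Chained fancy indexing: the += lands on a temporary copy.
chained = np.zeros((4, 4))
chained[candidate_pos, :][:, rated_pos] += weights

# 2) A single fancy index writes to the array, but the duplicate row
#    is only updated once (the last write wins).
buffered = np.zeros((4, 4))
buffered[np.ix_(candidate_pos, rated_pos)] += weights

# 3) np.add.at applies every update, so duplicates accumulate.
total_weights = np.zeros((4, 4))
np.add.at(total_weights, np.ix_(candidate_pos, rated_pos), weights)

print(chained.sum(), buffered[1, 2], total_weights[1, 2])
```

np.ix_ builds an open mesh from the two position arrays, so the selected block has the same N x M shape as weights.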

After @rveronese pointed out that it's impossible to do it in one go because of the duplicates in candidate_pos, I believe I have managed to do what I want with a for-loop over them:
candidate_pos = pos_df.loc[candidate_IDs] # don't worry about how I get these
rated_pos = pos_df.loc[rated_IDs] # ^^
for i, c in enumerate(candidate_pos):
    total_weights[c, rated_pos] += weights[i, :]
In this case, the indexing does not create a copy and the assignment should be working as expected...

Dataframe slicing with more than two dimensions

So I'm going through a machine learning tutorial and I'm met with this line of code:
pred_list = []
batch = train[-n_input:].reshape((1, n_input, n_features))
for i in range(n_input):
    pred_list.append(model.predict(batch)[0])
    batch = np.append(batch[:,1:,:],[[pred_list[i]]],axis=1)
Specifically, what happens inside the for loop. I understand that the first line of code grabs the first value of whatever is predicted, this is only one value. Next it appends the value to the end of batch, this is where I'm confused.
Why is batch in the second line of code batch[:,1:,:]? What does that mean? I'm not too sure about dataframe slicing, can someone explain what the second line of code in the for loop means? It would be very much appreciated. Here's the article in question. Thank you for reading.

It seems to me that batch is a numpy array with 3 dimensions, of shape (1, n_input, n_features): 1 row, n_input columns, and n_features depths. batch[:,1:,:] is a slice of batch that takes the second through last columns (Python uses 0-based indexing). I am guessing these columns represent inputs, so the slice holds all the features of every input except the first.
batch = np.append(batch[:,1:,:],[[pred_list[i]]],axis=1) appends [[pred_list[i]]] to that slice of batch along axis=1 which is columns. So I am guessing it removes the first input from batch and appends the new [[pred_list[i]]] as last input to batch and re-do this for all inputs in batch.
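The sliding-window step described above can be tried on a tiny array. In this sketch the sizes, the values in batch and the prediction pred are all made up; pred stands in for model.predict(batch)[0], which is assumed to have shape (n_features,).

```python
import numpy as np

n_input, n_features = 4, 1

# A batch holding the last 4 observations: values 0, 1, 2, 3.
batch = np.arange(n_input * n_features, dtype=float).reshape((1, n_input, n_features))

# Hypothetical prediction standing in for model.predict(batch)[0].
pred = np.array([99.0])

# Drop the oldest input (position 0 on axis 1) ...
tail = batch[:, 1:, :]                  # shape (1, n_input - 1, n_features)

# ... and append the prediction as the newest input on the same axis.
new_batch = np.append(tail, [[pred]], axis=1)

print(new_batch.shape)
print(new_batch[0, :, 0])
```

The shape stays (1, n_input, n_features), so the new batch can be fed straight back into the model on the next iteration.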

An ndarray can be indexed in two ways. Take
arr = np.array([[[1,2,3],
                 [3,4,5],
                 [7,8,9]]])
Either
arr[0][1][2] #row, col, layer
or
arr[0,1,2] #row, col, layer
The first index gives you the row, the second the column, the third the layer, and so on. Both forms give you the element in the 1st row, 2nd column and 3rd layer, which is 5 here. (arr has shape (1, 3, 3), so the first index can only be 0.)
batch[:,1:,:] means you want all the rows, all the columns after the 1st column, and all the layers.
P.S
I have used the word layers here, if you know a better word do suggest.

Excluding certain values in a matrix and matching it with another matrix it is based on?

I have two matrices, one with data and another with its flags (either True or False) of the same shape.
I want to somehow remove/disregard the true values in the flags matrix, and give the data the same shape with the new flag matrix without any trues.
In other words, I want a data set without any indices that are flagged as true.
Their shapes are (82680, 1, 1024, 4). Any method would be appreciated.
uvraw.data_array #data matrix
uvraw.flag_array #flags matrix (same shape)
I originally tried something like this:
flags = uvraw.flag_array
flag = []
for index, f in enumerate(flags):
    if np.any(f):
        flag.append(index)
data = uvraw.data_array
uvraw_noflags = np.sum(data(~flags))
But I know I did the index part and the last line wrong. Can anyone help me with something that builds a new matrix with all values except the indices noted in flag?
Update: the end goal of all this is to be able to run a line of code something like "np.sum(~flags)", in which I take the sum of all the relevant data and average it.
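One way to get a sum and an average over the unflagged values is a boolean mask. The sketch below uses two small made-up arrays in place of uvraw.data_array and the flags; the names good and masked are illustrative. Boolean indexing returns a flat 1-D array, so the original shape cannot be kept once values are removed; np.ma.masked_array is shown as an option that keeps the shape and skips flagged entries in reductions.

```python
import numpy as np

# Stand-ins for the data and flag arrays (same shape, made-up values).
data = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
flags = np.array([[False, True],
                  [False, False]])

# Keep only the entries whose flag is False; the result is 1-D.
good = data[~flags]
total = good.sum()
mean = good.mean()

# Alternative that keeps the original shape: flagged entries are masked
# and ignored by sum() and mean().
masked = np.ma.masked_array(data, mask=flags)
masked_mean = masked.mean()

print(good, total, mean, masked_mean)
```

The same two lines work unchanged on 4-D arrays such as the (82680, 1, 1024, 4) shape in the question, as long as data and flags have identical shapes.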

How to roll without periodic boundaries in TensorFlow?

I need a transformation of a tensor that is very similar to roll. The difference is that I do not want the values from the end of the axis to appear at the beginning. In other words, I want, for example, the 2nd element to move to 3rd place, but I do not want the last element to become the first one. Instead, I want the first elements to be zeros.
I have tried this:
prev_xs = tf.roll(xs, shift = 1, axis = 1)
prev_xs[:,0] = 0.0
However, it does not work because
TypeError: 'Tensor' object does not support item assignment
So, what is the proper solution of the problem?

You could use
prev_xs = tf.concat((tf.zeros([tf.shape(xs)[0], 1]), xs[:, :-1]), axis=1)
Step by step: we discard the last column of xs by indexing with [:, :-1]. We create a column of zeros with the appropriate number of rows. Then we concatenate it in front of what remains of xs, pushing every column back by 1. Note that tf.zeros defaults to float32, so xs is assumed to have that dtype.
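The shift-with-zero-fill idea (drop the last column, then put a column of zeros in front) can be checked on a small array. NumPy is used here as a stand-in so the sketch runs without TensorFlow: np.zeros and np.concatenate play the roles of tf.zeros and tf.concat, and the values in xs are made up.

```python
import numpy as np

xs = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]])

# One column of zeros, with as many rows as xs.
zeros = np.zeros((xs.shape[0], 1), dtype=xs.dtype)

# Drop the last column and prepend the zeros: every column moves right by 1
# and nothing wraps around from the end.
prev_xs = np.concatenate((zeros, xs[:, :-1]), axis=1)

# For comparison, roll wraps the last column around to the front.
rolled = np.roll(xs, shift=1, axis=1)

print(prev_xs)
print(rolled)
```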

Does Numpy array in Python select 'True' values by default when slicing data

import numpy as np
test = np.array([1, 1, 1, 0, 0, 0, 1])
t = test[test == 1]
print(t)
When I print 't' why is the code always going to print the "1" (i.e. "True") values only? Is there a behavior within a Numpy series to only select true values by default unless I define whether true or false?
I'm going through the Data Camp Python course and couldn't find the answer so reaching out to this group for help.
Thank you.

By writing test == 1 you have asked for t to contain only the elements equal to 1, so printing t will always show 1s. The comparison test == 1 produces a boolean array, and indexing with it keeps the positions where it is True. Try adding other numbers to the list and printing t again: you will still get an array of 1s.
If you change the comparison to, say, test == 3, you will get the 3s from the array if there are any. Otherwise, the array will be empty.
The same goes for strings, e.g. test == 'car'.
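A short sketch of what the comparison does, using the array from the question. The names mask, ones_only, zeros_only and threes are made up for illustration.

```python
import numpy as np

test = np.array([1, 1, 1, 0, 0, 0, 1])

# The comparison is evaluated first and gives one boolean per element.
mask = test == 1

# Indexing with that boolean array keeps the positions where mask is True.
ones_only = test[mask]

# A different condition selects different elements; 1 is not special.
zeros_only = test[test == 0]
threes = test[test == 3]      # no element matches, so this is empty

print(mask)
print(ones_only, zeros_only, threes)
```

So nothing selects "True values by default": the elements kept are exactly those for which the condition inside the brackets holds.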
Update: indexing with an integer array
Indexing with an array of integers picks out the numbers at those positions.
Take the following array:
x = np.array([1,0,1,1,1,0])
Now let's output 0,1,0,0,0,1 by typing x[x].
Let's break the operation into pieces.
x is the array itself. When x appears inside the square brackets, as in x[x], its values are used as integer indices, so each lookup returns one of the first two numbers of the array (index 0 or index 1).
To be more explicit, the indices of the array 1,0,1,1,1,0 are 0,1,2,3,4,5.
x[0] returns 1 and x[1] returns 0. That is how indexing works, so x[x] reads x[1], x[0], x[1], x[1], x[1], x[0] and outputs 0,1,0,0,0,1.
Hope this answers your question.
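To make the contrast concrete, the sketch below indexes the same array once with integers and once with booleans. The names by_position and by_mask are made up, and the astype(bool) conversion is added here only to show the boolean case side by side.

```python
import numpy as np

x = np.array([1, 0, 1, 1, 1, 0])

# Integer indexing: each value of x is used as a position in x.
by_position = x[x]              # x[1], x[0], x[1], x[1], x[1], x[0]

# Boolean indexing: positions where the mask is True are kept.
by_mask = x[x.astype(bool)]     # same result as x[x == 1]

print(by_position)
print(by_mask)
```

The dtype of the index array decides which rule applies: an integer array is read as positions, a boolean array as a keep/drop mask.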

We Keep Coding
