Understanding the slicing of NumPy array - python

I haven't understood the output of the following program:
import numpy as np
myList = [[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]]
myNumpyArray = np.array(myList)
print(myNumpyArray[0:3, 1:3])
Output
[[ 2 3]
[ 6 7]
[10 11]]
I expected the result to be the intersection of all rows and the 2nd to 4th columns. By that logic, the output should be:
2 3 4
6 7 8
10 11 12
14 15 16
What am I missing here?

The ending indices (the 3's in 0:3 and 1:3) are exclusive, not inclusive, while the starting indices (0 and 1) are in fact inclusive. If the ending indices were inclusive, then the output would be as you expect. But because they're exclusive, you're actually only grabbing rows 0, 1, and 2, and columns 1 and 2. The output is the intersection of those, which is equivalent to the output you're seeing.
If you are trying to get the data you expect, you can do myNumpyArray[:, 1:]. The : simply grabs all the elements of the array (in your case, in the first dimension of the array), and the 1: grabs all the content of the array starting at index 1, ignoring the data in the 0th place.
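A quick check of the half-open rule, using the array from the question. This is a minimal sketch; the names `requested` and `expected` are only illustrative.

```python
import numpy as np

myNumpyArray = np.array([[1, 2, 3, 4],
                         [5, 6, 7, 8],
                         [9, 10, 11, 12],
                         [13, 14, 15, 16]])

# 0:3 selects rows 0, 1, 2 and 1:3 selects columns 1, 2 (the stop index is excluded).
requested = myNumpyArray[0:3, 1:3]

# ":" selects every row, and "1:" selects column 1 through the last column.
expected = myNumpyArray[:, 1:]

print(requested.shape)  # (3, 2)
print(expected.shape)   # (4, 3)
```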

This is a classic case of just needing to understand slice notation.
Inside the brackets, you have the slice for each dimension:
arr[dim1_start:dim1_end, dim2_start:dim2_end]
For the above notation, the slice will include the elements starting at dimX_start, up to, and not including, dimX_end.
So, for what you wrote: myNumpyArray[0:3, 1:3]
you selected rows 0, 1, and 2 (not including 3) and columns 1 and 2 (not including 3)
I hope that helps explain your results.
For the result you were expecting, you would need something more like:
print(myNumpyArray[0:4, 1:4])
For more info on slicing, you might go to the numpy docs or look at a similar question posted a while back.
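One way to remember the rule from this answer: a slice start:end holds end - start elements, and built-in lists follow the same convention. The sketch below assumes a 4x4 array like the one in the question; `block` and `letters` are illustrative names.

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)

# A slice start:end contains end - start elements when both are in range.
block = arr[0:4, 1:4]
print(block.shape)  # (4, 3)

# Built-in sequences follow the same half-open convention.
letters = ["a", "b", "c", "d"]
print(letters[1:3])  # ['b', 'c']

# An end index past the last element is clipped rather than raising an error.
print(arr[0:10, 1:10].shape)  # (4, 3)
```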

Related

python homework interleaving list

The problem statement is:
Design and implement an algorithm that displays the elements of a list
by interleaving an element from the beginning and an element from the
end.
For example, input:
1 2 3 4 5 6 7 8
Output :
1 8 2 7 3 6 4 5
This is what I tried, but I don't know what happens with elements 7 and 8:
lista = [1, 2, 3, 4, 5, 6, 7, 8]
for i in range(len(lista)):
    lista.insert(2*i-1,lista.pop())
print("The list after shift is : " + str(lista))
# output:
# The list after shift is : [1, 7, 2, 8, 3, 6, 4, 5]
The only error in your code is that range(len(lista)) starts from 0, not from 1. Starting from zero, in the first iteration 2*i-1 is 2*0-1 = -1, so you call lista.insert(-1, lista.pop()), which inserts just before the last element of the list (a negative index in insert counts from the end). The popped 8 therefore lands in front of the 7 instead of at position 1.
To fix your code, you just need to start the range from 1. You are also iterating more than necessary: the range only needs to run from 1 up to half the length of your list, like this:
lista = [1, 2, 3, 4, 5, 6, 7, 8]
for i in range(1, len(lista)//2):
    lista.insert(2*i-1,lista.pop())
print("The list after shift is : " + str(lista))
# output:
# The list after shift is : [1, 8, 2, 7, 3, 6, 4, 5]
When you become more familiar with the language, you will see that this can be accomplished much more easily.
For example, you can use the Python slice syntax to achieve your goal. You slice from the beginning to the middle, and from the end to the middle (step of -1), then zip them together and flatten the result.
[i for z in zip(lista[:4],lista[:-5:-1]) for i in z]
# [1, 8, 2, 7, 3, 6, 4, 5]
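The one-liner above hardcodes the slice bounds for an eight-element list. A sketch that computes them from the length, and also handles an odd number of elements, might look like this (`interleave_ends` is a hypothetical helper name, not part of the original answer):

```python
from itertools import chain


def interleave_ends(items):
    """Return a new list alternating elements from the front and the back."""
    half = len(items) // 2
    front = items[:half]
    back = items[::-1][:half]  # the last `half` elements, in reverse order
    result = list(chain.from_iterable(zip(front, back)))
    if len(items) % 2:
        # Odd length: the middle element has no partner, so it goes last.
        result.append(items[half])
    return result


print(interleave_ends([1, 2, 3, 4, 5, 6, 7, 8]))  # [1, 8, 2, 7, 3, 6, 4, 5]
print(interleave_ends([1, 2, 3, 4, 5, 6, 7]))     # [1, 7, 2, 6, 3, 5, 4]
```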
Another option:
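import math
lista = [1, 2, 3, 4, 5, 6, 7, 8]
ans = []
for i in range(math.floor(len(lista)/2)):
    # take one element from the front and one from the back
    ans.append(lista[i])
    ans.append(lista[-i-1])
if (len(lista) % 2) != 0:
    # odd length: the middle element is left over, so append it last
    ans.append(lista[math.floor(len(lista)/2)])
print(ans)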
Technically speaking, I'd say it's two off-by-one errors (or one off-by-one error, but from -1 to +1, you'll see what I mean in the second paragraph). The first one is that you're subtracting 1 when you shouldn't. In the case when i = 0 (remember that range(n) goes from 0 to n-1), the insert position is being evaluated as 2*0-1 = (2*0)-1 = 0-1= -1 (for insert() method, that's the last position of the original list, pushing what was there forward, so it'll be the penultimate position of the NEW list).
But, when you remove the -1, the output becomes 8 1 7 2 6 3 5 4, which is close to what you want, but not quite right. What's missing is that the elements inserted should be at positions 1, 3, 5, 7, and not 0, 2, 4, 6. So, you'll actually need to add 1.
So, the shortest change to fix your code is to change lista.insert(2*i-1,lista.pop()) to lista.insert(2*i+1,lista.pop()).
Notice: if you put a print inside the for loop, you'll see that the output is already right after half the elements have been moved. That's because when len(lista) is 8 and you call lista.insert(x, lista.pop()) with x equal to 7 or more, you remove the last element (pop) and add it back at the end, so nothing changes. Hence, you could also change range(len(lista)) to range(len(lista)//2). Test whether it works when len(lista) is odd.
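Putting both changes together (the 2*i+1 position and the halved range), and trying the odd-length case suggested above. This is a sketch; `interleave_in_place` and `demo` are illustrative names.

```python
def interleave_in_place(lista):
    # Move the current last element to positions 1, 3, 5, ... in turn.
    for i in range(len(lista) // 2):
        lista.insert(2 * i + 1, lista.pop())
    return lista


print(interleave_in_place([1, 2, 3, 4, 5, 6, 7, 8]))  # [1, 8, 2, 7, 3, 6, 4, 5]
print(interleave_in_place([1, 2, 3, 4, 5, 6, 7]))     # [1, 7, 2, 6, 3, 5, 4]

# insert(-1, x) places x before the last element, not after it.
demo = [1, 2, 3]
demo.insert(-1, 99)
print(demo)  # [1, 2, 99, 3]
```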

How to update the array value and perform the same operation for the new values

import math
t=int(input())
for i in range(0,t):
    n,x=map(int,input().split())
    arr=list(map(int,input().split()))[:n]
    if x>=n:
        for i in range(0,n):
            i=0
            j=1
            p=1
            p=int(p)
            print(p)
            arr[i]=arr[i]^(2**p)
            arr[j]=arr[j]^(2**p)
            i+=1
            j+=1
            print(arr)
My array value is not updating. For example, if n=4, x=4 and arr=[2,5,5,9], after the first operation the array becomes [0, 7, 5, 9]. I then want to perform the operation on my new array, [0, 7, 5, 9], but it still seems to operate on my previous array, [2, 5, 5, 9]. I am new to Python, kindly help me. Thanks in advance.
Your code with n=4, x=4, and arr=[2,5,5,9] executes the for loop 4 times and produces the output below (excluding printing p):
[0, 7, 5, 9]
[2, 5, 5, 9]
[0, 7, 5, 9]
[2, 5, 5, 9]
The reason why it produces the output above is related to the XOR (^) operations you perform. Recall that a ^ a = 0 and a ^ 0 = a. Now let's see what happens:
In your for loop, you override i (you hardcode it to i=0) and you have j=1 for all the iterations. This means you never modify arr[2] and arr[3]; this is why their values remain unchanged, 5 and 9, respectively. (If you want to run the loop for n times, perhaps you want to consider using a for loop with another variable name, say, k, or a while loop).
arr[0] happens to be 2 and 2**p is also 2, because p=1 for all the iterations. So, 0 is the result of XOR-ing 2 with itself an even number of times, and 2 is the result of XOR-ing 2 with itself an odd number of times. For example, in the first iteration you actually do 2 ^ 2 and get 0, in the second 2 ^ 2 ^ 2 = 2 ^ 0 and get 2, and so on and so forth.
arr[1] is not 2, but more or less the same logic as above applies. In the first iteration, you have 5 ^ 2 (recall p=1 so 2**p=2) , which is 7, in the second iteration you get back your original number 5 because 5 ^ 2 ^ 2 = 5 ^ 0, and so on and so forth.
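A small reproduction of the toggling described above. This is a sketch: the input is hardcoded instead of being read with input(), and `history` is an illustrative name.

```python
arr = [2, 5, 5, 9]
p = 1
history = []

for _ in range(4):
    # In the original loop i and j are reset to 0 and 1 on every iteration,
    # so only arr[0] and arr[1] are ever touched.
    arr[0] ^= 2 ** p
    arr[1] ^= 2 ** p
    history.append(arr.copy())

for snapshot in history:
    print(snapshot)
```

Each snapshot alternates between [0, 7, 5, 9] and [2, 5, 5, 9], because XOR-ing with the same value twice restores the original number. The array is being updated on every pass; the second update simply undoes the first.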

Slicing a 2D NumPy Array by all zero rows

This is essentially the 2D array equivalent of slicing a Python list into smaller lists at indexes that store a particular value. I'm running a program that extracts a large amount of data out of a CSV file and copies it into a 2D NumPy array. The basic format of these arrays is something like this:
[[0 8 9 10]
[9 9 1 4]
[0 0 0 0]
[1 2 1 4]
[0 0 0 0]
[1 1 1 2]
[39 23 10 1]]
I want to separate my NumPy array along rows that contain all zero values to create a set of smaller 2D arrays. The successful result for the above starting array would be the arrays:
[[0 8 9 10]
[9 9 1 4]]
[[1 2 1 4]]
[[1 1 1 2]
[39 23 10 1]]
I've thought about simply iterating down the array and checking if the row has all zeros but the data I'm handling is substantially large. I have potentially millions of rows of data in the text file and I'm trying to find the most efficient approach as opposed to a loop that could waste computation time. What are your thoughts on what I should do? Is there a better way?
a is your array. You can use any to find all zero rows, remove them, and then use split to split by their indices:
#not_all_zero rows indices
idx = np.flatnonzero(a.any(1))
#all_zero rows indices
idx_zero = np.delete(np.arange(a.shape[0]),idx)
#select not_all_zero rows and split by all_zero row indices
output = np.split(a[idx],idx_zero-np.arange(idx_zero.size))
output:
[array([[ 0, 8, 9, 10],
[ 9, 9, 1, 4]]),
array([[1, 2, 1, 4]]),
array([[ 1, 1, 1, 2],
[39, 23, 10, 1]])]
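The same steps can be wrapped in a function. The sketch below additionally drops the empty chunks that np.split produces when zero rows are adjacent or sit at the start or end of the array; that filtering is an addition, not part of the answer above, and `split_on_zero_rows` is a hypothetical name.

```python
import numpy as np


def split_on_zero_rows(a):
    """Split a 2D array into blocks separated by all-zero rows."""
    has_nonzero = a.any(axis=1)
    nonzero_idx = np.flatnonzero(has_nonzero)
    zero_idx = np.flatnonzero(~has_nonzero)
    # Once the zero rows are removed, each split point shifts left by the
    # number of zero rows that came before it.
    split_points = zero_idx - np.arange(zero_idx.size)
    chunks = np.split(a[nonzero_idx], split_points)
    # Adjacent, leading, or trailing zero rows yield empty chunks; skip them.
    return [chunk for chunk in chunks if chunk.size]


a = np.array([[0, 8, 9, 10],
              [9, 9, 1, 4],
              [0, 0, 0, 0],
              [1, 2, 1, 4],
              [0, 0, 0, 0],
              [1, 1, 1, 2],
              [39, 23, 10, 1]])

for block in split_on_zero_rows(a):
    print(block.tolist())
```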
You can use the np.all function to check for rows which are all zeros, and then index appropriately.
# assume `x` is your data
indices = np.all(x == 0, axis=1)
zeros = x[indices]
nonzeros = x[np.logical_not(indices)]
The all function accepts an axis argument (as do many NumPy functions), which indicates the axis along which to operate. axis=1 here means the reduction runs across the columns within each row, so you get back a boolean array of shape (x.shape[0],), one value per row, which can be used to directly index x.
Note that this will be much faster than a for-loop over the rows, especially for large arrays.
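To make the shape of the mask concrete, here is a short sketch on the first four rows of the sample data. It uses ~ as a shorthand for np.logical_not on a boolean array.

```python
import numpy as np

x = np.array([[0, 8, 9, 10],
              [9, 9, 1, 4],
              [0, 0, 0, 0],
              [1, 2, 1, 4]])

# One boolean per row: True where every entry in that row equals 0.
indices = np.all(x == 0, axis=1)
print(indices.tolist())  # [False, False, True, False]
print(indices.shape)     # (4,)

# Keep only the rows that are not entirely zero.
nonzeros = x[~indices]
print(nonzeros.shape)    # (3, 4)
```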

Subtracting minimum of row from the row

I know that
a - a.min(axis=0)
will subtract the minimum of each column from every element in the column. I want to subtract the minimum in each row from every element in the row. I know that
a.min(axis=1)
specifies the minimum within a row, but how do I tell the subtraction to go by rows instead of columns? (How do I specify the axis of the subtraction?)
edit: For my question, a is a 2d array in NumPy.
Assuming a is a numpy array, you can use this:
new_a = a - np.min(a, axis=1)[:,None]
Try it out:
import numpy as np
a = np.arange(24).reshape((4,6))
print (a)
new_a = a - np.min(a, axis=1)[:,None]
print (new_a)
Result:
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
[[0 1 2 3 4 5]
[0 1 2 3 4 5]
[0 1 2 3 4 5]
[0 1 2 3 4 5]]
Note that np.min(a, axis=1) returns a 1d array of row-wise minimum values.
We then add an extra dimension to it using [:,None]. It then looks like this 2d array:
array([[ 0],
[ 6],
[12],
[18]])
When this 2d array participates in the subtraction, it gets broadcast to a shape of (4,6), which looks like this:
array([[ 0, 0, 0, 0, 0, 0],
[ 6, 6, 6, 6, 6, 6],
[12, 12, 12, 12, 12, 12],
[18, 18, 18, 18, 18, 18]])
Now, element-wise subtraction happens between the two (4,6) arrays.
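The broadcasting step can be inspected directly with np.broadcast_to. The sketch below also shows why the extra axis is needed: without it the shapes (4, 6) and (4,) cannot be aligned. The names `row_min`, `column` and `expanded` are illustrative.

```python
import numpy as np

a = np.arange(24).reshape((4, 6))
row_min = np.min(a, axis=1)  # shape (4,)
column = row_min[:, None]    # shape (4, 1)

# np.broadcast_to shows the array that the subtraction effectively uses.
expanded = np.broadcast_to(column, a.shape)
print(expanded)

# Without the extra axis, NumPy aligns trailing dimensions (6 against 4) and fails.
try:
    a - row_min
    failed = False
except ValueError as err:
    failed = True
    print("broadcast failed:", err)
```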
Specify keepdims=True to preserve a length-1 dimension in place of the dimension that min collapses, allowing broadcasting to work out naturally:
a - a.min(axis=1, keepdims=True)
This is especially convenient when axis is determined at runtime, but still probably clearer than manually reintroducing the squashed dimension even when the 1 value is fixed.
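For instance, when the axis is only known at runtime, a helper along these lines works for either axis without any manual reshaping (a sketch; `subtract_min` is a hypothetical name):

```python
import numpy as np


def subtract_min(a, axis):
    # keepdims=True leaves a length-1 axis, so the result broadcasts against a.
    return a - a.min(axis=axis, keepdims=True)


a = np.arange(24).reshape((4, 6))
print(subtract_min(a, axis=1)[0].tolist())     # [0, 1, 2, 3, 4, 5]
print(subtract_min(a, axis=0)[:, 0].tolist())  # [0, 6, 12, 18]
```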
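If you want to use only pandas, you can apply a lambda to every row (axis=1) that subtracts min(row) from the value in each column:
import pandas as pd
# df is assumed to be an existing, all-numeric DataFrame
new_df = pd.DataFrame()
for col in df.columns:
    new_df[col] = df.apply(lambda row: row[col] - min(row), axis=1)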

Select rows in a Numpy 2D array with a boolean vector

I have a matrix and a boolean vector:
>>> from numpy import *
>>> a = arange(20).reshape(4,5)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
>>> b = asarray( [1, 1, 0, 1] ).reshape(-1,1)
array([[1],
[1],
[0],
[1]])
Now I want to select all the corresponding rows in this matrix where the corresponding index in the vector is equal to zero.
>>> a[b==0]
array([10])
How can I make it so this returns this particular row?
[10, 11, 12, 13, 14]
The shape of b is somewhat strange, but if you can craft it as a nicer index it's a simple selection:
idx = b.reshape(a.shape[0])
print(a[idx==0,:])
# prints [[10 11 12 13 14]]
You can read this as, "select all the rows where the index is 0, and for each row selected take all the columns". Your expected answer should really be a list of lists, since you are asking for all of the rows that match a criterion.
Nine years later, I just wanted to add another answer to this question in the case where b is actually a boolean vector.
Square bracket indexing of a NumPy matrix with a scalar index gives the corresponding row, so for example a[2] gives the third row of a. Multiple rows can be selected (potentially with repeats) using a vector of indices.
Similarly, logical vectors that have the same length as the number of rows act as "masks", for example:
a = np.arange(20).reshape(4,5)
b = np.array( [True, True, False, True] )
a[b] # 3x5 matrix formed with the first, second, and last row of a
To answer the OP specifically, the only thing to do from there is to negate the vector b:
a[ np.logical_not(b) ]
Lastly, if b is defined as in the OP, with ones and zeros and a column shape, one would first flatten it and convert it to booleans: a[ np.logical_not( b.ravel().astype(bool) ) ].
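Put together for the data in the question, this might look as follows. It is a sketch that imports NumPy as np rather than using the star import; `mask` is an illustrative name, and ~ is shorthand for np.logical_not on boolean arrays.

```python
import numpy as np

a = np.arange(20).reshape(4, 5)
b = np.asarray([1, 1, 0, 1]).reshape(-1, 1)  # column of ones and zeros, as in the question

# Flatten b to length 4 and convert it to booleans so it can act as a row mask.
mask = b.ravel().astype(bool)

selected = a[~mask]         # rows where b is 0
print(selected.tolist())    # [[10, 11, 12, 13, 14]]
print(selected.shape)       # (1, 5)
print(a[mask].shape)        # (3, 5)
```

The result stays two-dimensional even when only one row matches; `selected[0]` gives that single row as a 1D array.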
