I am playing with the NumPy library and there is a behaviour that I do not understand:
I defined a simple vector Y:
Y = np.array([6,7,8],dtype = np.int16)
Then, I tried to add an additional element at the end of the array using the np.insert() method. I specified the position to be at -1:
Z = np.insert(Y,(-1),(19))
My element was inserted not in the last position but in the penultimate one:
Output: [ 6  7 19  8]
Could somebody explain to me why this element was not inserted at the end of my array?
Negative indexes are calculated using len(l) + i (with l being the object and i being the negative index).
For example, with a list x = [1, 2, 3, 4] the index -2 is calculated as 4-2 = 2, so x[-2] returns 3
In your example, len(Y) is 3, so the index -1 is calculated as 3 - 1 = 2.
Therefore, 19 is inserted at index 2, the penultimate position. To fix this, use Z = np.insert(Y, len(Y), 19).
PS - this is the same with list.insert, not just with numpy arrays
Edit (suggested by Homer512): np.append may be easier than np.insert if you want to add to the end of an array: Z = np.append(Y, 19)
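A quick way to compare the three calls discussed above is to run them side by side. This is only a small sketch that reuses Y from the question; the names Z_penultimate, Z_end, Z_append and py_list are made up for illustration.

```python
import numpy as np

Y = np.array([6, 7, 8], dtype=np.int16)

# -1 is resolved to len(Y) - 1 == 2, so 19 lands before the last element
Z_penultimate = np.insert(Y, -1, 19)

# len(Y) points one past the last element, so 19 lands at the end
Z_end = np.insert(Y, len(Y), 19)

# np.append always adds at the end
Z_append = np.append(Y, 19)

# list.insert treats a negative index the same way
py_list = [6, 7, 8]
py_list.insert(-1, 19)

print(Z_penultimate.tolist())  # [6, 7, 19, 8]
print(Z_end.tolist())          # [6, 7, 8, 19]
print(Z_append.tolist())       # [6, 7, 8, 19]
print(py_list)                 # [6, 7, 19, 8]
```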
Related
I'd like to slice a numpy array to get all but the first item, unless there's only one element, in which case, I only want to select that element (i.e. don't slice).
Is there a way to do this without using an if-statement?
x = np.array([1,2,3,4,5])
y = np.array([1])
print(x[1:]) # works
print(y[1 or None:]) # doesn't work
I tried the above, but it didn't work.
A way to write that without a conditional is to use negative indexing with -len(arr) + 1:
>>> x = np.array([1,2,3,4,5])
>>> y = np.array([1])
>>> x[-len(x)+1:]
array([2, 3, 4, 5])
>>> y[-len(y)+1:]
array([1])
If the array has N elements where N > 1, the slice becomes -N+1:. Since -N+1 < 0, it is resolved as (N + (-N + 1)):, which is 1:, i.e. from index 1 onwards.
Otherwise, when N == 1, the slice is 0:, i.e. from the first element onwards, which is the only element.
Because of how slicing works, an empty array (i.e., N = 0 case) will result in an empty array too.
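The three cases above (N > 1, N == 1 and N == 0) can be checked with a small sketch. The helper name drop_first_unless_single is hypothetical and only wraps the slice from this answer.

```python
import numpy as np

def drop_first_unless_single(arr):
    # Hypothetical helper: the start index is -len(arr) + 1, which is
    # 1 for an empty array, 0 for a single element, and negative otherwise.
    return arr[-len(arr) + 1:]

many = np.array([1, 2, 3, 4, 5])
single = np.array([1])
empty = np.array([])

print(drop_first_unless_single(many).tolist())    # [2, 3, 4, 5]
print(drop_first_unless_single(single).tolist())  # [1]
print(drop_first_unless_single(empty).tolist())   # []
```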
You can just write an if / else:
x[1 if len(x) > 1 else 0:]
array([2, 3, 4, 5])
y[1 if len(y) > 1 else 0:]
array([1])
Or:
y[int(len(y) > 1):]
array([1])
x[int(len(x) > 1):]
array([2, 3, 4, 5])
Just move None out of the brackets
x = [1,2,3,4,5]
y = [1]
print(x[1:])
print(y[1:] or None)
Use a ternary expression to make the logic clear but still be able to use it as a function argument or inside other expressions.
The ternary expression x if len(x) == 1 else x[1:] works and is very clear. And you can use it as a parameter in a function call, or in a larger expression.
E.g.:
>>> x = np.array([1,2,3,4,5])
>>> y = np.array([1])
>>> print(x if len(x) == 1 else x[1:])
[2 3 4 5]
>>> print(y if len(y) == 1 else y[1:])
[1]
Musings on other solutions
I'm not sure if you're looking for the most concise code possible to do this, or just for the ability to have the logic inside a single expression.
I don't recommend fancy slicing solutions with negative indexing, for the sake of legibility of your code. Think about future readers of your code, even yourself in a year or two.
Using this in a larger expression
In the comments, you mention you need a solution that can be incorporated into something like a comprehension. The ternary expression can be used as is within a comprehension. For example, this code works:
l = [np.array(range(i)) for i in range(5)]
l2 = [
    x if len(x) == 1 else x[1:]
    for x in l
]
I've added spacing to make the code easier to read, but it would also work as a one liner:
l2 = [x if len(x) == 1 else x[1:] for x in l]
EDIT note
Earlier, I thought you wanted the first element extracted from the list in the single-element case, i.e., x[0], but I believe you actually want that single-element list unsliced, i.e., x, so I've updated my answer accordingly.
Check whether the array has more than one element. If it does, you can delete the first element with np.delete, which returns a new array without it:
print(np.delete(x, 0))
The result is a new array containing every item except the first.
I'm trying to access the neighbouring elements (above, below, left and right) in a square 2D list. When the element I'm looking at is on an 'edge' of the 2D list, however, it will try to access a list index that doesn't exist. Here's the code I'm using:
surroundings = [
my_2D_array[currentY+1][currentX],
my_2D_array[currentY-1][currentX],
my_2D_array[currentY][currentX+1],
my_2D_array[currentY][currentX-1]
]
How do I get it to 'roll over', so in a list l with 3 items, instead of throwing an error accessing l[3], it would simply access l[0] instead?
The best way to perform a 'roll over', or 'wrap around' as I'd say, is to use modulus:
>>> x = [1, 2, 3]
>>> x[3 % len(x)]
1
>>> 3 % len(x) # 3 % 3 is 0
0
If you are 100% sure the length of the list is constant, simply hard-code the modulus right-hand-side value into your code:
x[index % 3]
This works because the modulus removes as many multiples of the right-hand operand as possible from the left-hand one and returns the value left over. So, x % y returns the remainder after (floor) dividing x by y.
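Applied to the 2D neighbour lookup from the question, the modulus can wrap both the row and the column index. The following is a minimal sketch, assuming a rectangular list of lists; the function name wrapped_neighbours and the sample grid are made up for illustration.

```python
def wrapped_neighbours(grid, y, x):
    # Hypothetical helper: returns the cells below, above, right and left
    # of (y, x), in the same order as the question's code, wrapping around
    # the edges with the modulus.
    height = len(grid)
    width = len(grid[0])
    return [
        grid[(y + 1) % height][x],
        grid[(y - 1) % height][x],
        grid[y][(x + 1) % width],
        grid[y][(x - 1) % width],
    ]

grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(wrapped_neighbours(grid, 0, 0))  # [4, 7, 2, 3]
print(wrapped_neighbours(grid, 2, 2))  # [3, 6, 7, 8]
```

In Python, (y - 1) % height is never negative, so the same expression covers both the top/left and the bottom/right edges.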
I have a list of points and I want to keep the points of the list only if the distance between them is greater than a certain threshold. So, starting from the first point, if the distance between the first point and the second is less than the threshold then I would remove the second point, then compute the distance between the first one and the third one. If this distance is less than the threshold, compare the first and fourth point. Else move to the distance between the third and fourth and so on.
So for example, if the threshold is 2 and I have
list = [1, 2, 5, 6, 10]
then I would expect
new_list = [1, 5, 10]
Thank you!
Not a fancy one-liner, but you can just iterate over the values in the list and append each one to a new list if it is at least the threshold away from the last value kept so far, which you get with new[-1]:
lst = range(10)
diff = 3
new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff:
        new.append(n)
Afterwards, new is [0, 3, 6, 9].
Concerning your comment "What if i had instead a list of coordinates (x,y)?": In this case you do exactly the same thing, except that instead of just comparing the numbers, you have to find the Euclidean distance between two points. So, assuming lst is a list of (x,y) pairs:
if not new or ((n[0]-new[-1][0])**2 + (n[1]-new[-1][1])**2)**.5 >= diff:
Alternatively, you can convert your (x,y) pairs into complex numbers. For those, basic operations such as addition, subtraction and absolute value are already defined, so you can just use the above code again.
lst = [complex(x,y) for x,y in lst]
new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff: # same as in the first version
        new.append(n)
print(new)
Now, new is a list of complex numbers representing the points: [0j, (3+3j), (6+6j), (9+9j)]
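If you would rather keep the points as plain tuples, the same loop can be sketched with math.dist from the standard library (available from Python 3.8). The function name thin_points and the sample data below are assumptions made for illustration, not part of the original answer.

```python
import math

def thin_points(points, min_dist):
    # Hypothetical helper: keep a point only if it is at least min_dist
    # away from the last point that was kept.
    kept = []
    for p in points:
        if not kept or math.dist(p, kept[-1]) >= min_dist:
            kept.append(p)
    return kept

diagonal = [(i, i) for i in range(10)]
print(thin_points(diagonal, 3))  # [(0, 0), (3, 3), (6, 6), (9, 9)]
```

Because math.dist accepts coordinates of any dimension, the same helper should also work for (x, y, z) points.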
While the solution by tobias_k works, it is not the most efficient (in my opinion, but I may be overlooking something). It is based on list order and does not consider that the element which is close (within the threshold) to the largest number of other elements should be eliminated last. The element that has the fewest such connections (or proximities) should be considered and checked first. The approach I suggest will likely retain the maximum number of points that are farther than the specified threshold from the other elements in the given list. This works very well for a list of vectors, and therefore for x,y or x,y,z coordinates. If, however, you intend to use this solution with a list of scalars, you can simply include this line in the code: orig_list=np.array(orig_list)[:,np.newaxis].tolist()
Please see the solution below:
import numpy as np
thresh = 2.0
orig_list=[[1,2], [5,6], ...]
nsamp = len(orig_list)
arr_matrix = np.array(orig_list)
# np.float was removed in NumPy 1.24; the builtin float is equivalent
distance_matrix = np.zeros([nsamp, nsamp], dtype=float)
# column ii holds the distance from every point to point ii
for ii in range(nsamp):
    distance_matrix[:, ii] = np.apply_along_axis(
        lambda x: np.linalg.norm(np.array(x) - np.array(arr_matrix[ii, :])),
        1,
        arr_matrix)
# number of points (itself included) closer than thresh to each point
n_proxim = np.apply_along_axis(lambda x: np.count_nonzero(x < thresh),
                               0,
                               distance_matrix)
# visit the points from fewest to most close neighbours
idx = np.argsort(n_proxim).tolist()
idx_out = list()
for ii in idx:
    for jj in range(ii+1):
        if ii not in idx_out:
            if distance_matrix[ii, jj] < thresh:
                if ii != jj:
                    idx_out.append(jj)
# pop from the highest index down so earlier pops do not shift later ones
pop_idx = sorted(np.unique(idx_out).tolist(),
                 reverse=True)
for pop_id in pop_idx:
    orig_list.pop(pop_id)
nsamp = len(orig_list)
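As a side note, the pairwise distance matrix and the proximity counts built above can probably also be computed without apply_along_axis by using broadcasting. This is only a sketch on made-up sample points, not a replacement for the full procedure.

```python
import numpy as np

thresh = 2.0
# Made-up sample points, for illustration only
points = np.array([[1.0, 2.0], [5.0, 6.0], [1.5, 2.5]])

# diff[i, j] is the vector from point j to point i
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
# distance_matrix[i, j] is the Euclidean distance between points i and j
distance_matrix = np.linalg.norm(diff, axis=-1)

# For each point, count the points (itself included) closer than thresh
n_proxim = np.count_nonzero(distance_matrix < thresh, axis=0)
print(n_proxim.tolist())  # [2, 1, 2]
```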
I'm trying to sort an array where it starts from the second number and looks at the one before it to see if the previous number is larger. If it is, I want to swap the numbers, otherwise keep the number where it is. Currently my code does not do that. When I input the array below the only thing that changes is that the 2 becomes an 11, giving me two 11's in the middle. What is going wrong?
#given an array of digits a of length N
a = [7, 3, 11, 2, 6, 16]
N = len(a)
# moving forward along a starting from the second position to the end
# define _sillysort(a, start_pos):
# set position = start_pos
# moving backwards along a from start_pos:
# if the a[position-1] is greater than a[position]:
# swap a[position-1] and a[position]
def sillysort(a, start_pos):
    a_sorted = []
    start_pos = a[1]
    for position in a:
        if a[start_pos-1] >= a[start_pos]:
            a[start_pos-1], a[start_pos] = a[start_pos], a[start_pos-1]
        else:
            a[start_pos-1] = a[start_pos]
        a_sorted.append(position)
        position += 1
    return a_sorted
When I run this, sillysort(a, N), I get this output [7, 3, 11, 11, 6, 16].
Your code has a couple of issues
start_pos = a[1]
If you are already providing start_pos as an argument to your function, why are you reinitialising it inside the function? Moreover, if a is the array you want to sort, why is the start_pos of your algorithm the second element of the array a itself?
for position in a:
    if a[start_pos-1] >= a[start_pos]:
        a[start_pos-1], a[start_pos] = a[start_pos], a[start_pos-1]
    else:
        a[start_pos-1] = a[start_pos]
    a_sorted.append(position)
    position += 1
The for in loop will iterate through the array a and position will take the values of the elements of the array. In your example position will take the values in the following order:
7, 3, 11, 2, 6, 16
What I don't understand is why you are incrementing position by 1 at the end of the for loop. And once again, you are using the values inside the array to index the array rather than the indices themselves.
Since in your example start_pos takes the value a[1], i.e. 3, your code compares a[2] and a[3] on every iteration. On the first pass 11 >= 2 is true, so the two are swapped and the list becomes [7, 3, 2, 11, 6, 16]. On the second pass 2 >= 11 is false, so the else branch runs a[2] = a[3] and overwrites the 2 with 11. Every later pass just swaps two equal values, which is why you end up with two 11s.
You have probably confused yourself with the variable names. See if this helps you.
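For reference, here is one possible reading of the pseudocode in the question, written with indices rather than values. It is a sketch under the assumption that the goal is an insertion-style sort; the default start_pos=1 and the copy of the input list are my own choices.

```python
def sillysort(a, start_pos=1):
    # Work on a copy so the caller's list is left untouched (an assumption)
    a = list(a)
    # Move forward along a, starting from start_pos
    for i in range(start_pos, len(a)):
        position = i
        # Move backwards, swapping while the previous element is larger
        while position > 0 and a[position - 1] > a[position]:
            a[position - 1], a[position] = a[position], a[position - 1]
            position -= 1
    return a

print(sillysort([7, 3, 11, 2, 6, 16]))  # [2, 3, 6, 7, 11, 16]
```

Note that position and i are indices here, so a[position] is always a lookup by position and never by a value stored in the list.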
I'm iterating over a NumPy (nd) array of OpenCV lines. I want to remove all lines which are outwith 8 degrees of vertical. I realise the numpy array is immutable and what I'm doing in the code is not right but it demonstrates the idea of what I'm trying to do;
index = 0
for line in self.lines[0]:
if (line[1]*180)/np.pi > 8:
self.lines[0] = np.delete(self.lines[0], index, axis=0)
index+=1
How can I go about removing these NumPy array indexes?
Thanks!
You cannot delete array elements by index while iterating over that array; it will give wrong results. Say you are at index 5 and that element satisfies the if condition, so it is deleted. The element that was at index 6 now moves to index 5, and the next index you look at is 6, which holds the element that was originally at index 7. The condition is therefore never checked for the element that was originally at index 6.
The basic idea for dealing with this is to collect these indices in a list and delete them outside the loop. So your code becomes:
index = 0
idx = []
for line in self.lines[0]:
    if (line[1]*180)/np.pi > 8:
        idx.append(index)
    index+=1
# assigning a shorter array back into self.lines[0] would fail to broadcast,
# so delete along axis 1 and rebind self.lines instead
self.lines = np.delete(self.lines, idx, axis=1)
I guess you are missing indentation in your code, so I interpret it like this:
index = 0
for line in self.lines[0]:
    if (line[1]*180)/np.pi > 8:
        self.lines[0] = np.delete(self.lines[0], index, axis=0)
    index+=1
You can do that of course without a loop. First we do the check to build the boolean indexing array (True/False values for every item in the array):
index = self.lines[0,:,1]*180/np.pi > 8
Then we select the lines where the condition is false:
self.lines = self.lines[:, np.nonzero(~index)[0]]
Here np.nonzero converts the boolean array into an array listing all indices whose values are True.
For example:
>>> a = np.arange(10)
>>> a[~a%2==0]
array([1, 3, 5, 7, 9])
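To make the boolean-indexing idea concrete for the lines array, here is a small self-contained sketch. The array lines below is made-up sample data with the assumed shape (1, N, 2) holding (rho, theta) pairs, with theta in radians; the variable names are hypothetical.

```python
import numpy as np

# Made-up (rho, theta) pairs, theta in radians, shape (1, 4, 2)
lines = np.array([[[10.0, 0.05], [20.0, 0.5], [30.0, 0.1], [40.0, 1.2]]])

# True for every line more than 8 degrees away from vertical
too_steep = lines[0, :, 1] * 180 / np.pi > 8

# Keep only the lines where the condition is False
kept_by_mask = lines[:, ~too_steep]

# Equivalent: turn the mask into indices and delete those entries
kept_by_delete = np.delete(lines, np.nonzero(too_steep)[0], axis=1)

print(kept_by_mask[0, :, 0].tolist())  # [10.0, 30.0]
```

Both variants return a new array and leave lines itself unchanged.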