Avoiding an indexing error in Python while looping - python

Regardless of whether this is the most efficient way to structure this sorting algorithm in Python (it's not), my understanding of indexing and of the built-in 'min' function fails to explain the following error in the code below:
Error:
builtins.IndexError: list index out of range
Here's the code:
#Create function to sort arrays with numeric entries in increasing order
def selection_sort(arr):
    arruns = arr #pool of unsorted array values, initially the same as 'arr'
    indmin = 0 #initialize arbitrary value for indmin.
    #indmin is the index of the minimum value of the entries in arruns
    for i in range(0,len(arr)):
        if i > 0: #after the first looping cycle
            del arruns[indmin] #remove the entry that has been properly sorted
            #from the pool of unsorted values.
        while arr[i] != min(arruns):
            indmin = arruns.index(min(arruns)) #get index of min value in arruns
            arr[i] = arruns[indmin]
#example case
x = [1,0,5,4] #simple array to be sorted
selection_sort(x)
print(x) #The expectation is: [0,1,4,5]
I've looked at a couple other index error examples and have not been able to attribute my problem to anything occurring while entering/exiting my while loop. I thought that my mapping of the sorting process was sound, but my code even fails on the simple array assigned to x above. Please help if able.

arr and arruns are two names for the same list object. Deleting items through arruns therefore shrinks arr as well, while the range that i runs over was fixed from the original length, so arr[i] eventually points past the end.
Fix:
arruns = [] + arr
This creates a new list for arruns, so deleting from it no longer changes arr.
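To illustrate the difference between a second name and a copy, here is a minimal sketch. The names alias and independent are made up for this example and do not come from the question:

```python
arr = [1, 0, 5, 4]
alias = arr              # second name bound to the same list object
independent = [] + arr   # new list holding the same elements

del alias[0]             # arr shrinks too, because alias is arr

print(alias is arr)        # True
print(independent is arr)  # False
print(arr)                 # [0, 5, 4]
print(independent)         # [1, 0, 5, 4]
```

list(arr) and arr[:] make the same kind of shallow copy as [] + arr.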

Related

How to remove numbers in an array if they exist in another

Here is my code so far. (Using NumPy for arrays)
avail_nums = np.array([1,2,3,4,5,6,7,8,9]) # initial available numbers
# print(avail_nums.shape[0])
# print(sudoku[spaces[x,1],spaces[x,2]]) # index of missing numbers in sudoku
print('\n')
# print(sudoku[spaces[x,1],:]) # rows of missing numbers
for i in range(sudoku[spaces[x,1],:].shape[0]): # Number of elements in the missing number row
    for j in range(avail_nums.shape[0]): # Number of available numbers
        if(sudoku[spaces[x,1],i] == avail_nums[j]):
            avail_nums= np.delete(avail_nums,[j])
print(avail_nums)
A for loop cycles through all the elements in the 'sudoku row' and nested inside, another loop cycles through avail_nums. Every time there is a match (given by the if statement), that value is to be deleted from the avail_nums array until finally all the numbers in 'sudoku row' aren't in avail_nums.
I'm greeted with this error:
IndexError: index 8 is out of bounds for axis 0 with size 8
pointing to the line with the if statement.
Because avail_nums is shrinking, after the first deletion this happens. How can I resolve this issue?
When you delete items from the array, it gets smaller, but the inner for loop does not know that: range(avail_nums.shape[0]) is evaluated once, when the inner loop starts, so j keeps counting up to the old size and you get an out-of-bounds error. I would avoid deleting from the array you are iterating over.
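The failure can be reproduced in a few lines. This is only a sketch: it uses a plain Python list instead of a NumPy array to stay self-contained, and remove_value_in_place is a hypothetical helper, not code from the question:

```python
def remove_value_in_place(values, target):
    """Hypothetical helper: delete target while looping over a fixed range."""
    for j in range(len(values)):   # range is computed once, from the original length
        if values[j] == target:
            del values[j]          # the list is now one element shorter
    return values


try:
    remove_value_in_place([1, 2, 3, 4], 2)
    outcome = "no error"
except IndexError as exc:
    outcome = "IndexError: " + str(exc)

print(outcome)   # IndexError: list index out of range
```

After the deletion the list has three elements, but j still reaches 3, which no longer exists.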
My solution is to use a temporary array that contains allowed elements and then assign it to the original array name
temporary_array = list()
for element in array:
    if element in another_array: # you can do this in Python
        continue # ignore it
    temporary_array.append(element)
array = temporary_array
The resulting array will contain only the elements that do not exist in another_array.
You could also use list comprehension:
temporary_array = [ element for element in array if element not in another_array ]
array = temporary_array
This is the same concept written more compactly.
Another option is the built-in filter(), which takes a filter function and an iterable. In Python 3 it returns a lazy iterator rather than a list, so wrap it in list() if you need a list. Here I am using lambda notation for the filter function:
array = list(filter(lambda x: x not in another_array, array))
Since you are using NumPy, you could also look at numpy.extract() (see https://numpy.org/doc/stable/reference/generated/numpy.extract.html). For example, using numpy.where(), numpy.in1d() and numpy.extract() we could write:
condition = numpy.where(numpy.in1d(np_array, np_another_array),False,True)
np_array = numpy.extract(condition, np_array)

Why no index out of range error?

lst = [
8,2,22,97,38,15,0,40,0,75,4,5,7,78,52,12,50,77,91,8, #0-19
49,49,99,40,17,81,18,57,60,87,17,40,98,43,69,48,4,56,62,0, #20-39
81,49,31,73,55,79,14,29,93,71,40,67,53,88,30,3,49,13,36,65, #40-59
52,70,95,23,4,60,11,42,69,24,68,56,1,32,56,71,37,2,36,91,
22,31,16,71,51,67,63,89,41,92,36,54,22,40,40,28,66,33,13,80,
24,47,32,60,99,3,45,2,44,75,33,53,78,36,84,20,35,17,12,50,
32,98,81,28,64,23,67,10,26,38,40,67,59,54,70,66,18,38,64,70,
67,26,20,68,2,62,12,20,95,63,94,39,63,8,40,91,66,49,94,21,
24,55,58,5,66,73,99,26,97,17,78,78,96,83,14,88,34,89,63,72,
21,36,23,9,75,0,76,44,20,45,35,14,0,61,33,97,34,31,33,95,
78,17,53,28,22,75,31,67,15,94,3,80,4,62,16,14,9,53,56,92,
16,39,5,42,96,35,31,47,55,58,88,24,0,17,54,24,36,29,85,57,
86,56,0,48,35,71,89,7,5,44,44,37,44,60,21,58,51,54,17,58,
19,80,81,68,5,94,47,69,28,73,92,13,86,52,17,77,4,89,55,40,
4,52,8,83,97,35,99,16,7,97,57,32,16,26,26,79,33,27,98,66,
88,36,68,87,57,62,20,72,3,46,33,67,46,55,12,32,63,93,53,69,
4,42,16,73,38,25,39,11,24,94,72,18,8,46,29,32,40,62,76,36, #320-339
20,69,36,41,72,30,23,88,34,62,99,69,82,67,59,85,74,4,36,16, #340-359
20,73,35,29,78,31,90,1,74,31,49,71,48,86,81,16,23,57,5,54, #360-379
1,70,54,71,83,51,54,69,16,92,33,48,61,43,52,1,89,19,67,48] #380-399
prodsum = 1
def prod(iter):
    p = 1
    for n in iter:
        p *= n
    return p
for n in range(0,5000,20): #NOT OUT OF RANGE???
    for i in range(0,17):
        if prod(lst[n+i:n+i+4]) > prodsum:
            prodsum = prod(lst[n+i:n+i+4])
I'm trying to learn/improve my very rudimentary skills in Python so I've been going through Project Euler challenges. The challenge question is more complex but I basically have a 20x20 grid and have to find 4 adjacent numbers with the largest product.
I basically turned the grid into a list (with 400 values) and was going to scan row indices. I accidentally entered a large number for my for loop and noticed I didn't get an out-of-range error. Why is this?
You would get an out-of-range error with plain indexing, e.g. if you had a list of 10 elements and asked for my_list[20]. With slicing, my_list[a:b], out-of-range bounds are clamped to the length of the list: you get the elements from a up to b-1, or up to the end of the list if b is past it. That's just a design decision of the language.
You don't get an out-of-range error because you never directly access your list by index (n).
You use lst[n+i:n+i+4] to get a slice of lst, which is simply empty if your indices are out of range, so prod(...) is called with [] and returns 1.
Slicing outside the bounds of a sequence doesn't cause an error. If you try to index a single item out of bounds, you'll get an error.
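A small sketch of the difference, using a made-up ten-element list named values and a prod() written along the lines of the one in the question:

```python
values = list(range(10))        # ten elements, valid indices 0..9

print(values[8:12])             # [8, 9]  (stop is clamped to the end)
print(values[20:24])            # []      (start is already past the end)

try:
    values[20]                  # plain indexing is not clamped
    index_result = "no error"
except IndexError:
    index_result = "IndexError"
print(index_result)             # IndexError


def prod(numbers):
    # Multiply everything together; an empty input leaves p at 1.
    p = 1
    for n in numbers:
        p *= n
    return p


print(prod(values[20:24]))      # 1
```

Since 1 is never greater than a prodsum that starts at 1, the out-of-range iterations in the question change nothing.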

Subtracting a number from an array if condition is met python

I am facing a very basic problem with an if condition in Python.
The array current_img_list has shape (500L,1). If the number 82459 is in this array, I want to subtract 1 from it.
index = np.random.randint(0,num_train-1,500)
# shape of current_img_list is (500L,1)
# values in index array is randomized
current_img_list = train_data['img_list'][index]
# shape of current_img_list is (500L,1)
if (any(current_img_list) == 82459):
    i = np.where(current_img_list == 82459)
    final_list = i-1
Explanation of variables - train_data is of type dict. It has 4 elements in it. img_list is one of the elements with size (215375L,1). The value of num_train is 215375 and size of index is (500L,1)
Firstly, I don't know whether this condition is working or not. I tried the all() function and the numpy.where() function but with no success. Secondly, I can't think of a way to subtract 1 from 82459 directly at the index where it is stored without affecting the rest of the values in this array.
Thanks
Looping over the array in Python will be much slower than letting numpy's vectorized operators do their thing:
import numpy as np
num_train = 90000 # for example
index = np.random.randint(0,num_train-1,500)
index[index == 82459] -= 1
current_img_list = np.array([1,2,3,82459,4,5,6])
i = np.where(current_img_list == 82459)
current_img_list[i] -= 1
I'm a little confused by what you are trying to achieve here, but I'll give it a go:
If you're trying to subtract 1 from every element of your array that is equal to 82459, then you could iterate through the array with a for loop. Each time the value at the current index is equal to 82459, subtract 1 from the element at that index.
If you need more help, please post the rest of the relevant code so I can debug.
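One reason the original condition never fires: any() returns True or False, and neither compares equal to 82459. The sketch below uses a plain Python list with made-up values as a stand-in for the real array, so it only illustrates the logic, not the NumPy specifics:

```python
current_img_list = [7, 82459, 12, 82459]   # made-up stand-in for the real data

# any() returns True or False, and neither equals 82459.
wrong_test = (any(current_img_list) == 82459)
# Membership is what the condition was meant to check.
right_test = (82459 in current_img_list)

# Subtract 1 only where the value matches, leaving the rest untouched.
adjusted = [v - 1 if v == 82459 else v for v in current_img_list]

print(wrong_test)   # False
print(right_test)   # True
print(adjusted)     # [7, 82458, 12, 82458]
```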

'float' object does not support item deletion

I am trying to run through a list and delete elements that do not meet a certain threshold, but I am receiving the error 'float' object does not support item deletion when I try to delete.
Why am I getting this error? Is there any way to delete items from lists of floats like this?
Relevant Code:
def remove_abnormal_min_max(distances, avgDistance):
    #Define cut off for abnormal roots
    cutOff = 0.20 * avgDistance # 20 percent of avg distance
    for indx, distance in enumerate(distances): #for all the distances
        if(distance <= cutOff): #if the distance between min and max is less than or equal to cutOff point
            del distance[indx] #delete this distance from the list
    return distances
Your list of float values is called distances (plural), each individual float value from that sequence is called distance (singular).
You are trying to use the latter, rather than the former. del distance[indx] fails because that is the float value, not the list object.
All you need to do is add the missing s:
del distances[indx]
#           ^
However, now you are modifying the list in place, shortening it as you loop. This'll cause you to miss elements; items that were once at position i + 1 are now at i while the iterator happily continues at i + 1.
The work-around to that is to build a new list object with everything you wanted to keep instead:
distances = [d for d in distances if d > cutOff]
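To make the skipping visible, here is a small sketch with made-up numbers. The names cut_off, in_place and rebuilt are illustrative only:

```python
cut_off = 2.0
distances = [1.0, 1.5, 5.0, 0.5, 6.0]

# In-place deletion while iterating: after 1.0 is deleted, 1.5 slides
# into index 0 and the iterator moves on to index 1 without checking it.
in_place = list(distances)
for indx, distance in enumerate(in_place):
    if distance <= cut_off:
        del in_place[indx]

# Building a new list checks every element exactly once.
rebuilt = [d for d in distances if d > cut_off]

print(in_place)   # [1.5, 5.0, 6.0]
print(rebuilt)    # [5.0, 6.0]
```

The value 1.5 is below the cut-off but survives the in-place loop, which is exactly the missed-element effect described above.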
You mentioned in your comment that you need to reuse the index of the deleted distance. You can build a list of all the indxs you need at once using a list comprehension:
indxs = [k for k,d in enumerate(distances) if d <= cutOff]
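And then you can iterate over this new list to do the other work you need. Go from the highest index to the lowest, so that each deletion does not shift the positions that are still to be processed:
for indx in reversed(indxs):
    del distances[indx]
    del otherlist[2*indx:2*indx+2] # or whatever; a list needs a slice here, not a tuple of indices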
You may also be able to massage your other work into another list comprehension:
indxs = [k for k,d in enumerate(distances) if d > cutOff] # note reversed logic
distances = [distances[indx] for indx in indxs] # one statement so doesn't fall in the modify-as-you-iterate trap
otherlist = [item for indx in indxs for item in otherlist[2*indx:2*indx+2]] # keep both entries that belong to each kept index
As an aside, if you are using NumPy, which is a numerical and scientific computing package for Python, you can take advantage of boolean arrays and boolean (mask) indexing to select the elements you want to keep directly:
import numpy as np
distances = np.array(distances) # convert to a numpy array so we can use boolean indexing
keep = distances > cutOff # True for every distance above the cut-off
distances = distances[keep] # this won't work on a regular Python list

Making a index array from an array in numpy

Good morning experts,
I have an array which contains integer numbers, and I have a list with the unique values that are in the array, sorted in a special order. What I want is to make another array which will contain, for each value in the array a, its index in that list.
#a numpy array with integer values
#size_x and size_y: array dimensions of a
#index_list contain the unique values of a sorted in a special order.
#b New array with the index values
for i in xrange(0,size_x):
    for j in xrange(0,size_y):
        b[i][j]=index_list.index(a[i][j])
This works but it takes long time to do it. Is there a faster way to do it?
Many thanks for your help
German
The slow part is the lookup
index_list.index(a[i][j])
It will be much quicker to use a Python dictionary for this task, ie. rather than
index_list = [ item_0, item_1, item_2, ...]
use
index_dict = { item_0:0, item_1:1, item_2:2, ...}
Which can be created using:
index_dict = dict( (item, i) for i, item in enumerate(index_list) )
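Put together, the dictionary approach might look like the sketch below. The contents of index_list and a are made up, and a is a nested list here rather than a NumPy array, purely to keep the example self-contained:

```python
index_list = [30, 10, 20]            # made-up "special order" of the unique values
a = [[10, 20, 30],
     [30, 30, 10]]                   # made-up 2-D data as nested lists

# Build the value -> position mapping once.
index_dict = {item: i for i, item in enumerate(index_list)}

# Each lookup is now a dictionary access (constant time on average)
# instead of a scan through index_list.
b = [[index_dict[value] for value in row] for row in a]

print(b)   # [[1, 2, 0], [0, 0, 1]]
```

A value of a that is missing from index_list raises KeyError here, whereas index_list.index() would raise ValueError.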
Didn't try, but as this is pure numpy, it should be much faster than a dictionary-based approach:
# note that the code will use the next higher value if a value is
# missing from index_list.
new_vals, old_index = np.unique(index_list, return_index=True)
# use searchsorted to find the index:
b_new_index = np.searchsorted(new_vals, a)
# And the original index:
b = old_index[b_new_index]
Alternatively you could simply fill any holes in index_list.
(Edited: the earlier version of this code was simply wrong, or at least very limited.)
