Iterating two variables over a single list


I am trying to write a program which iterates over i, j and k, and finds the minimum value of a certain formula (vik + vkj - vij), where v is a 2D list of distances between points, k is a new point to be inserted into the new array, and i, j are existing values in the new array.
Sorry if this explanation is a little confusing...
My code is this:
values = [[0,2],[3,3],[4,5],[2,1],[7,1]]
points = [0,1,2,3,4]
new = [2,4]
for k in points: #k is the point that will be inserted
    minVal = 1000000000000000 #set to any arbitrarily high value, that will be larger than any other distance
    for i,j in new:
        nextVal = values[i][k] + values[k][j] - values[i][j] # finds value which minimises vik + vkj - vij
        if nextVal < minVal:
            minVal = nextVal
            idx = i #saves index of i,j that gave minimal value, so that k can be inserted between these
            jdx = j
    new.insert(idx + 1, k) #insert after idx or before jdx
Anyway, the problem is I get:
for i,j in new:
TypeError: 'int' object is not iterable
I read somewhere that this is because objects of type int can't be iterated, but I don't get how else to solve this.
How can I have two separate values iterate through a list of ints, while making sure I remember which two values of i,j gave the minimum value, so I can then add k in between them?

I don't fully understand the description of what you're aiming to do, but I can explain this error. It comes from the line:
for i,j in new:
You have defined new as the list [2,4]. The loop takes one element of that list at a time, so on the first pass Python tries to unpack the single integer 2 into the two names i and j. Unpacking needs an iterable, just as a for loop does, and an int is not one. It fails for the same reason as:
for i in 2:
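If the intent is for i and j to be neighbouring entries of new, one possible way to get two loop variables from a single list is zip(new, new[1:]), which yields adjacent pairs. The sketch below is only an illustration under that assumption: dist is a made-up square distance matrix (the values list in the question is 5x2, so values[i][k] would not work for every k), and cheapest_insertion is a hypothetical helper name.

```python
# Hypothetical symmetric 5x5 distance matrix, invented for this example.
dist = [
    [0, 2, 9, 4, 7],
    [2, 0, 3, 6, 5],
    [9, 3, 0, 1, 8],
    [4, 6, 1, 0, 2],
    [7, 5, 8, 2, 0],
]

def cheapest_insertion(dist, tour, k):
    """Return a copy of tour with k inserted between the adjacent pair (i, j)
    that minimises dist[i][k] + dist[k][j] - dist[i][j]."""
    best_cost = float("inf")
    best_pos = None
    # zip(tour, tour[1:]) yields (tour[0], tour[1]), (tour[1], tour[2]), ...
    for pos, (i, j) in enumerate(zip(tour, tour[1:])):
        cost = dist[i][k] + dist[k][j] - dist[i][j]
        if cost < best_cost:
            best_cost = cost
            best_pos = pos + 1  # list position directly after i
    return tour[:best_pos] + [k] + tour[best_pos:]

tour = [2, 4]
for k in [0, 1, 3]:  # only the points that are not in the tour yet
    tour = cheapest_insertion(dist, tour, k)
print(tour)
```

Remembering the position of the pair, rather than the values i and j, means the insertion still lands in the right place even when a value's index in the list differs from the value itself.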

Related

List Problem in python to sort a list of string without using sort or sorted function

I am stuck in this list problem, I am unable to solve it.
list1= ["aaditya-2", "rahul-9", "shubhangi-4"]
I need to sort this list without using the sort/sorted functions, and it should be sorted on the basis of the number at the end of each string.
Output:
["aaditya-2", "shubhangi-4", "rahul-9"]
OR
["rahul-9", "shubhangi-4", "aaditya-2" ]

You could try something like this:
def sort_arr(arr, function):
    result = list(arr) # copy the input so the original list is left untouched
    for i in range(1, len(result)): # loop from the second element of the list to the last
        item = result[i] # the element that has to be placed
        k = function(item) # its sort key, in this case the last digit
        j = i - 1 # keep track of the index being considered
        while j >= 0 and k < function(result[j]): # while the key is less than the key of the item before it, do
            result[j + 1] = result[j] # shift that item one step towards the end of the list
            j -= 1
        result[j + 1] = item # once the correct position has been found, place the element there
    return result # return the sorted list of the original strings
print(sort_arr(list1, lambda x: int(x[-1]))) # the lambda gets the last digit and turns it into an integer
This is an implementation of insertion sort. The lambda is a small anonymous function that supplies the sort key; because x[-1] reads only the final character, it assumes single-digit numbers.
Hope this helps!
EDIT: Just realized this did not directly answer OP's question. Updated answer to match.

As I understand it, this is a question of ordering rather than of implementing a sorting algorithm. Here I define an order function that takes two strings and returns them ordered by their last character. Applying it three times sorts a 3-element list.
def order(a, b):
    return (b, a) if a[-1] > b[-1] else (a, b)
list1 = ["aaditya-2", "rahul-9", "shubhangi-4"]
a, b, c = list1
a, b = order(a, b)
a, c = order(a, c)
b, c = order(b, c)
list1 = [a, b, c] # ["aaditya-2", "shubhangi-4", "rahul-9"]
For reverse sort, just use < in the order function.
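Both answers compare only the last character, which is fine for the single-digit data in the question. As a hedged sketch of how the same idea might extend to longer lists and multi-digit suffixes, the example below parses the whole number after the last hyphen and uses a plain insertion sort with a reverse flag. The names trailing_number and insertion_sort, and the sample data with a two-digit suffix, are made up for illustration.

```python
def trailing_number(s):
    """Return the integer after the last hyphen, e.g. 'rahul-19' -> 19."""
    return int(s.rsplit("-", 1)[1])

def insertion_sort(items, key, reverse=False):
    """Return a new list ordered by key(item), without sort() or sorted()."""
    result = []
    for item in items:
        pos = len(result)
        # Walk left while the new item belongs before the element at pos - 1.
        while pos > 0 and (
            key(item) > key(result[pos - 1]) if reverse
            else key(item) < key(result[pos - 1])
        ):
            pos -= 1
        result.insert(pos, item)
    return result

names = ["aaditya-2", "rahul-19", "shubhangi-4"]  # hypothetical data
print(insertion_sort(names, trailing_number))
print(insertion_sort(names, trailing_number, reverse=True))
```

Comparing the last characters of these strings would put "rahul-19" after everything only by accident ("9" is the largest digit); parsing the full number avoids relying on that.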

I cannot really tell what is out of range

I keep getting this error and I can't tell what is wrong. I draw some random indexes from the array temp, which holds only the integers from 0 to len(students_grades). Then I look up the values at those indexes in students_grades and store them in an object called cluster, which has two attributes (centroid and individuals).
What I want is the following: generate some random indexes from temp, fetch the values at those indexes from students_grades, and then remove those entries from students_grades. Can someone help?
data = pd.read_csv("CourseEvaluation.csv", header=None)
students_grades = []
for i in range(1, 151):
    students_grades.append([float(data.values[i, j]) for j in range(1, 21)])
k = int(input("enter how many clusters :"))
indices = numpy.random.choice(temp, k, False)
initial_clusters = []
for i in range(0, len(indices)):
    print("product number:", indices[i] + 1)
    cluster = Cluster(students_grades[indices[i]],
                      students_grades[indices[i]])
    students_grades.pop(indices[i])
    initial_clusters.append(cluster)
Error:
Traceback (most recent call last):
  File "", line 103, in <module>
    cluster = Cluster(students_grades[indices[i]], students_grades[indices[i]])
IndexError: list index out of range

You might want to reorganise your code a bit ;-)
Anyway, I can't be a hundred percent sure, but I'm guessing your problem is the following: you're making an array of random indices, named indices, whose largest possible value is temp - 1, and that value might occur at any position in the array (but only once):
indices = numpy.random.choice(temp, k, False)
Next you loop over those indices, and at every step you're reducing the size of your students_grades list:
students_grades.pop(indices[i])
So assuming that temp = len(students_grades) at the start of this, after i steps the length of students_grades is only temp - i, but the index you take from the indices array can be as high as temp - 1, so you can get an index-out-of-range error.
To remedy this, remember that
indices = numpy.random.choice(temp, k, False)
means that you won't get the same index twice, so it isn't necessary to remove the value at the index from students_grades.
BTW, just some general Python style advice: instead of
for i in range(len(some_list)):
    ... # stuff with some_list[i]
you can use
for element in some_list:
    ... # stuff with element
for more readable, 'pythonic', code ;-)
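If the chosen rows really do have to be removed from students_grades, one way to avoid the shrinking-list problem described above is to read all the chosen rows first and delete them afterwards, highest index first. This is a minimal sketch with made-up data; it uses random.sample from the standard library as a stand-in for numpy.random.choice(..., replace=False), and initial_centroids is a hypothetical name that replaces the Cluster objects from the question.

```python
import random

# Hypothetical stand-in for students_grades: 10 distinct rows of 3 grades each.
students_grades = [[float(r), float(r + 1), float(r + 2)] for r in range(10)]
k = 3

# random.sample draws k distinct positions (sampling without replacement).
indices = random.sample(range(len(students_grades)), k)

# Read every chosen row first, while the list still has its original length.
initial_centroids = [students_grades[i] for i in indices]

# Then remove the chosen rows, highest index first, so that earlier deletions
# do not shift the positions of rows that still have to be removed.
for i in sorted(indices, reverse=True):
    students_grades.pop(i)

print(len(initial_centroids), len(students_grades))
```

Deleting in ascending order would not raise an error in every run, but it would silently remove the wrong rows once the list has shifted.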

I want to find out the indices of the duplicate elements in an array

a=[2, 1, 3, 5, 3, 2]
def firstDuplicate(a):
    for i in range(0,len(a)):
        for j in range(i+1,len(a)):
            while a[i]==a[j]:
                num=[j]
                break
    print(num)
print(firstDuplicate(a))
The output should be coming as 4 and 5 but it's coming as 4 only

You can find the indices of all duplicates in an array in O(n) time and O(1) extra space with something like the following:
def get_duplicate_indices(arr):
    inds = []
    for i, val in enumerate(arr):
        val = abs(val)
        if arr[val] >= 0:
            arr[val] = -arr[val]
        else:
            inds.append(i)
    return inds
>>> get_duplicate_indices(a)
[4, 5]
Note that this will modify the array in place! If you want to keep your input array unmodified, replace the first few lines in the above with:
def get_duplicate_indices(a):
    arr = a.copy() # so we don't modify in place. Drawback is it's now O(n) extra space
    inds = []
    for i, val in enumerate(a):
        # ...
Essentially this uses the sign of each element in the array as an indicator of whether a number has been seen before: seeing the value val flips the sign of arr[val]. If arr[val] is already negative, the value val has been seen before, so we append the current index to our list of duplicate indices.
Note that this only works when every value is a positive integer smaller than the length of the array, because each value is used as an index and a zero cannot carry a sign. If some values are too large, we can just extend the working array so that it is at least one longer than the maximum value in the input. Easy peasy.

There are some things wrong with your code. The following will collect the indexes of every first duplicate:
def firstDuplicate(a):
    num = [] # list to collect indexes of first dupes
    for i in range(len(a)-1): # second to last is enough here
        for j in range(i+1, len(a)):
            if a[i]==a[j]: # while-loop made little sense
                num.append(j) # grow list and do not override it
                break # stop inner loop after first duplicate
    print(num)
There are of course more performant algorithms to achieve this that are not quadratic.
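As one example of a non-quadratic approach, a set of already-seen values gives a single pass over the list. This is a sketch rather than a drop-in replacement for the code above; duplicate_indices is a made-up name, and the values are assumed to be hashable.

```python
def duplicate_indices(values):
    """Return the indices of every element that already appeared earlier."""
    seen = set()
    dupes = []
    for i, v in enumerate(values):
        if v in seen:
            dupes.append(i)  # v was met at a smaller index
        else:
            seen.add(v)
    return dupes

a = [2, 1, 3, 5, 3, 2]
print(duplicate_indices(a))  # [4, 5]
```

Unlike the sign-flipping version, this leaves the input list unchanged and places no restriction on the range of the values, at the cost of O(n) extra memory for the set.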

Excluding values in array - python

Ok so I have an array in python. This array holds indices to another array. I removed the indices I wanted to keep from this array.
stations = [1,2,3]
Let's call x the main array. It has 5 columns and I removed the 1st and 5th and put the rest in the array called stations.
I want to be able to create an if statement where the values from stations are excluded. So I'm just trying to find the number of instances (days) where the indices in the stations array are 0 and the other indices (0 and 4) are not 0.
How do I go about doing that? I have this so far, but it doesn't seem to be correct.
for j in range(len(x)):
    if x[j,0] != 0 and x[j,4] != 0 and numpy.where(x[j,stations[0]:stations[len(stations)-1]]) == 0:
        days += 1
return days

I don't think your problem statement is very clear, but if you want the columns of x whose indices are not contained in stations, then do this (assuming x is a plain list with one entry per column).
excluded_station_x = [col for i, col in enumerate(x) if i not in stations]
This is a list comprehension; it's a way of building a new list by traversing an iterable. It's the same as writing
excluded_station_x = []
for i, col in enumerate(x):
    if i not in stations:
        excluded_station_x.append(col)
enumerate() yields both the index and the value of each element as we iterate through the list.
As requested, here it is without enumerate.
You could also just del each of the bad indices, although I dislike this because it mutates the original list. Delete from the highest index down, so that earlier deletions don't shift the positions that still have to be removed.
for i in sorted(stations, reverse=True):
    del x[i]
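For the counting part of the question (days where the station columns are all 0 and the remaining columns are not), a plain-Python sketch could look like the following. It assumes x is a list of rows with one row per day; the sample data and the names others, stations_all_zero and others_all_nonzero are invented for illustration.

```python
# Hypothetical data: one row per day, five columns per row.
x = [
    [3, 0, 0, 0, 7],  # counts: stations are 0, columns 0 and 4 are not
    [0, 0, 0, 0, 2],  # does not count: column 0 is 0
    [1, 0, 4, 0, 5],  # does not count: station column 2 is not 0
    [2, 0, 0, 0, 9],  # counts
]
stations = [1, 2, 3]
# Every column index that is not a station column, here [0, 4].
others = [c for c in range(len(x[0])) if c not in stations]

days = 0
for row in x:
    stations_all_zero = all(row[c] == 0 for c in stations)
    others_all_nonzero = all(row[c] != 0 for c in others)
    if stations_all_zero and others_all_nonzero:
        days += 1
print(days)  # 2
```

Checking each station column explicitly also avoids the slice stations[0]:stations[len(stations)-1] from the question, which leaves out the last station column because the end of a slice is exclusive.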

Removing points from list if distance between 2 points is below a certain threshold

I have a list of points and I want to keep the points of the list only if the distance between them is greater than a certain threshold. So, starting from the first point, if the distance between the first point and the second is less than the threshold, then I would remove the second point and compute the distance between the first one and the third one. If this distance is less than the threshold, compare the first and fourth points. Otherwise, move on to the distance between the third and fourth, and so on.
So for example, if the threshold is 2 and I have
list = [1, 2, 5, 6, 10]
then I would expect
new_list = [1, 5, 10]
Thank you!

Not a fancy one-liner, but you can just iterate the values in the list and append each one to a new list if it is at least diff away from the last value kept so far, which you get with new[-1]:
lst = range(10)
diff = 3
new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff:
        new.append(n)
Afterwards, new is [0, 3, 6, 9].
Concerning your comment "What if i had instead a list of coordinates (x,y)?": In this case you do exactly the same thing, except that instead of just comparing the numbers, you have to find the Euclidean distance between two points. So, assuming lst is a list of (x,y) pairs:
if not new or ((n[0]-new[-1][0])**2 + (n[1]-new[-1][1])**2)**.5 >= diff:
Alternatively, you can convert your (x,y) pairs into complex numbers. For those, basic operations such as addition, subtraction and absolute value are already defined, so you can just use the above code again.
lst = [complex(x,y) for x,y in lst]
new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff: # same as in the first version
        new.append(n)
print(new)
Now, new is a list of complex numbers representing the points: [0j, (3+3j), (6+6j), (9+9j)]
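If converting to complex numbers feels indirect, the same greedy filter can be sketched with math.dist (available from Python 3.8), which computes the Euclidean distance between two coordinate sequences of equal length. The function name thin_points and the sample points are assumptions made for this example.

```python
import math

def thin_points(points, min_dist):
    """Keep a point only if it is at least min_dist from the last kept point."""
    kept = []
    for p in points:
        if not kept or math.dist(p, kept[-1]) >= min_dist:
            kept.append(p)
    return kept

points = [(i, i) for i in range(10)]  # hypothetical points on the diagonal
print(thin_points(points, 3))  # [(0, 0), (3, 3), (6, 6), (9, 9)]
```

Because math.dist accepts any dimension, the same function also works for (x, y, z) triples, or for scalars wrapped in one-element tuples.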

While the solution by tobias_k works, it is not the most efficient in terms of how many points it retains (in my opinion, but I may be overlooking something). It depends on list order and does not consider that an element which is close (within the threshold) to the largest number of other elements should be eliminated last. The element that has the fewest such connections (or proximities) should be considered and checked first. The approach I suggest will likely retain the maximum number of points that lie outside the specified threshold from the other elements in the given list. This works very well for a list of vectors, and therefore for x,y or x,y,z coordinates. If, however, you intend to use this solution with a list of scalars, you can simply include this line in the code: orig_list=np.array(orig_list)[:,np.newaxis].tolist()
Please see the solution below:
import numpy as np
thresh = 2.0
orig_list=[[1,2], [5,6], ...]
nsamp = len(orig_list)
arr_matrix = np.array(orig_list)
distance_matrix = np.zeros([nsamp, nsamp], dtype=float) # the np.float alias no longer exists in current NumPy
for ii in range(nsamp):
    # column ii holds the distance from every point to point ii
    distance_matrix[:, ii] = np.apply_along_axis(lambda x: np.linalg.norm(np.array(x)-np.array(arr_matrix[ii, :])),
                                                 1,
                                                 arr_matrix)
# number of points within thresh of each point (each point also counts itself)
n_proxim = np.apply_along_axis(lambda x: np.count_nonzero(x < thresh),
                               0,
                               distance_matrix)
idx = np.argsort(n_proxim).tolist() # points with the fewest close neighbours come first
idx_out = list()
for ii in idx:
    for jj in range(ii+1):
        if ii not in idx_out:
            if distance_matrix[ii, jj] < thresh:
                if ii != jj:
                    idx_out.append(jj)
# pop from the highest index down so the remaining indices stay valid
pop_idx = sorted(np.unique(idx_out).tolist(),
                 reverse=True)
for pop_id in pop_idx:
    orig_list.pop(pop_id)
nsamp = len(orig_list)
