Get the largest absolute difference in an array in Python

I want to write a for loop that computes the absolute difference between every pair of elements in the array and then prints the largest absolute difference.
arr = []
n = int(input("number of elements in array: "))
for i in range(0, n):
    arr.append(int(input('insert element: ')))
I've done this already, but I would like to know how slow this method is compared to taking the absolute difference between the first and last elements after sorting the array.
EXAMPLE
Input: {2, 7, 3, 4, 1, 9}
Output: 8 (|1 – 9|)
This is what I have tried:
arr = []
n = int(input("number of elements in the array: "))
for i in range(0, n):
    arr.append(int(input('enter an element: ')))
arr.sort()
print(arr[-1] - arr[0])

If you are fine with NumPy, there is a way to do this.
First, generate every unordered pair of elements from the input using itertools.combinations:
import numpy as np
from itertools import combinations
alist = [2, 7, 3, 4, 1, 9]
all_comb = list(combinations(alist, 2))
# [(2, 7), (2, 3), (2, 4), (2, 1), (2, 9), (7, 3), (7, 4), (7, 1), (7, 9), (3, 4), (3, 1), (3, 9), (4, 1), (4, 9), (1, 9)]
Next, use np.diff to compute the difference within every tuple:
abs_diff = abs(np.diff(all_comb)).flatten()
# array([5, 1, 2, 1, 7, 4, 3, 6, 2, 1, 2, 6, 3, 5, 8])
Finally, np.argmax gives the index of the maximum difference, which selects the matching pair:
all_comb[abs_diff.argmax()]
# (1, 9)

arr = []
n = int(input("number of elements in the array: "))
my_min, my_max = None, None
for i in range(0, n):
    arr.append(int(input('enter an element: ')))
    # compare the values themselves, not their absolute values,
    # otherwise inputs such as -1, -5, -3 give the wrong answer
    if my_min is None or arr[i] < my_min:
        my_min = arr[i]
    if my_max is None or arr[i] > my_max:
        my_max = arr[i]
print(f"{abs(my_max - my_min)} (|{my_max} - {my_min}|)")
You can achieve this just by "remembering" the smallest and the largest number seen so far.
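The same idea can be wrapped in a function so it is easy to check against the example from the question. This is only a minimal sketch: the name max_abs_diff and the returned tuple layout are assumptions made for illustration, not part of the answer above.

```python
def max_abs_diff(values):
    """Return (difference, smallest, largest) using a single pass."""
    iterator = iter(values)
    # Seed both trackers with the first element (empty input is not handled).
    smallest = largest = next(iterator)
    for value in iterator:
        if value < smallest:
            smallest = value
        elif value > largest:
            largest = value
    return largest - smallest, smallest, largest


print(max_abs_diff([2, 7, 3, 4, 1, 9]))  # (8, 1, 9)
print(max_abs_diff([-1, -5, -3]))        # (4, -5, -1)
```

The second call uses only negative numbers, which is the case where comparing absolute values instead of the values themselves would go wrong.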

As I understand your question:
Sorting the elements and then comparing the first and last one is much faster than finding the largest difference by iterating over every pair. A comparison sort such as Python's built-in sort needs on the order of n log n comparisons for n elements, because each element is only compared against a small part of the data that is already in order.
Comparing all possible pairs needs n(n-1)/2 comparisons, which is on the order of n², because nothing tells us in advance which pair will give the largest difference.
So sorting finds the largest difference much faster than a for loop over every possible pair in the list.
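To put rough numbers on that comparison, here is a small timing sketch. The function names, the list size of 500 and the repeat count are arbitrary assumptions chosen for illustration, and the measured times will vary from machine to machine.

```python
import random
from itertools import combinations
from timeit import timeit


def diff_all_pairs(values):
    # Checks every unordered pair: n(n-1)/2 subtractions.
    return max(abs(a - b) for a, b in combinations(values, 2))


def diff_sorted(values):
    # Sorts a copy, then subtracts the smallest from the largest.
    ordered = sorted(values)
    return ordered[-1] - ordered[0]


data = [random.randint(-1000, 1000) for _ in range(500)]
# Both approaches must agree on the answer.
assert diff_all_pairs(data) == diff_sorted(data)

t_pairs = timeit(lambda: diff_all_pairs(data), number=5)
t_sorted = timeit(lambda: diff_sorted(data), number=5)
print(f"all pairs: {t_pairs:.4f}s, sorted: {t_sorted:.4f}s")
```

Doubling the list size should roughly quadruple the time of the all-pairs version, while the sorted version grows only slightly faster than linearly.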
I hope I got your query right :)
UPDATED
The main question is how to find the largest difference in a list with a for loop, and which approach is faster, so here it is.
In my opinion the code below will be even faster than sorting and then taking the difference, because it iterates over the list only once (O(n)) and has the largest difference at the end. There is no need to go through every possible pair of values.
I think this may help :)
list_a = []
n = int(input("number of elements in array: "))
for i in range(0, n):
    # store input after converting to integer
    list_a.append(int(input('insert element: ')))
# largest difference found so far
largest_diff_so_far = 0
# the two numbers that produce that difference, stored as [smaller, larger]
actual_diff_number = None
# start from the first number; we don't need to go through every possible pair
first = list_a[0]
# iterate through all the numbers only once, up to the last one
for second in list_a:
    # difference between the current value and the smallest value seen so far
    current_diff = second - first
    # update when the current difference is larger than the previous largest one
    if largest_diff_so_far == 0 or current_diff > largest_diff_so_far:
        # a negative difference here means that `second` is smaller than `first`
        if current_diff < 0:
            # the difference is negative, so store its absolute value
            largest_diff_so_far = abs(current_diff)
            # store the pair in reverse order, so the larger value sits in second
            # place and is not overwritten in later iterations
            actual_diff_number = [second, first]
            # `second` is the new smallest value, so later iterations compare against it
            first = second
            continue
        # otherwise the current difference is the new largest difference
        largest_diff_so_far = current_diff
        # remember the two numbers whose difference is the largest so far
        actual_diff_number = [first, second]
    # this is the main time saver: if `second` is smaller than `first`, we no longer
    # need the old `first`, and the largest difference grows accordingly
    elif current_diff < 0:
        first = second
        # recompute the largest difference with the new smallest value
        largest_diff_so_far = actual_diff_number[1] - first
        # update the smaller number in the stored pair
        actual_diff_number[0] = first
# after the loop, largest_diff_so_far and actual_diff_number contain the answer
print(actual_diff_number, largest_diff_so_far)

Related

How to compare value with value afterwards in a list?

x = [1,2,3,4,50,6,3,2,3,8]
for i in x:
    if i > x[x.index(i)+1:10]:
        print(i)
TypeError: '>' not supported between instances of 'int' and 'list'
I want to determine which number is larger than all the numbers after it; in this case, 50.
However, I get the error above.
Does anyone have any idea how to solve this problem?
This should work (note that the last element has no successors, so it always passes the check):
for i in range(len(x)):
    if all(x[i] > x[j] for j in range(i + 1, len(x))):
        print(x[i])
Please try these simple solutions and see which one fits best for your need. I have left comments briefly explaining what each case does. Keep coding in Python, it is a great computer language, pal!
numbers_lst = [1, 2, 3, 4, 50, 6, 3, 2, 3, 8]
# 1- Fast approach using built in max function
print("Using Max Built-in function: ", max(numbers_lst))
# 2- Manual Approach iterating all the elements of the list
max_num = numbers_lst[0]
for n in numbers_lst:
    max_num = n if n >= max_num else max_num
print("Manual Iteration: ", max_num)
# 3- Using comprehensions, in this case for the list
max_num = numbers_lst[0]
[max_num:=n for n in numbers_lst if n >= max_num]
print("List Comprehension: ", max_num)
# 4- Sort the list in ascending order and print the last element in the list.
numbers_lst.sort()
# printing the last element, which is in this case the largest one
print("Using the sort list method:", numbers_lst[-1])
# 5 - Using the built in sorted function and getting the last list element afterwards
sorted_lst = sorted(numbers_lst, reverse=True)
max_num = sorted_lst[0]
print("Sorted List: ", max_num)
Although the answer given by @Green Cloak Guy is perfectly valid, it is inefficient because it is an O(n^2) solution. Below is an O(n) solution that stores the greatest element seen from the right.
x = [1,2,3,4,50,6,10,2,3,1]
largest = [0 for i in range(len(x))]  # temporary array to store, for every index i, the largest number from the right to index i
largest[-1] = x[-1]
ans = []  # list to store numbers satisfying the condition
for i in range(len(x) - 2, -1, -1):
    if x[i] > largest[i+1]:
        ans.append(x[i])
    largest[i] = max(x[i], largest[i+1])
for i in range(len(ans)-1, -1, -1):  # print elements in the same order as in the given list
    print(ans[i])
You could also make use of Python's built-in sorted function.
x = [1,2,3,4,50,6,3,2,3,8]
y = sorted(x, reverse=True)
print(y[0])
The issue is that you are comparing an individual value against a whole list of values. You will either have to make another loop or use the max() function on that sublist to get the highest value it contains.
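The max() option mentioned above could look like the following minimal sketch. The function name larger_than_all_following is hypothetical, and the last element is skipped on the assumption that a value with no successors should not be reported.

```python
def larger_than_all_following(values):
    """Return the values that are greater than every value after them."""
    result = []
    # The last value has no successors, so it is left out of the loop.
    for i, value in enumerate(values[:-1]):
        # Compare against the largest value in the remainder of the list.
        if value > max(values[i + 1:]):
            result.append(value)
    return result


x = [1, 2, 3, 4, 50, 6, 3, 2, 3, 8]
print(larger_than_all_following(x))  # [50]
```

This fixes the TypeError, but it still rescans the rest of the list for every element, so it is quadratic in the length of the list.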
A more efficient strategy is to process the list backwards while keeping track of the maximum number encountered.
By accumulating the maximum value backwards through the list and comparing it to the preceding number, you can find all the numbers that are greater than all their successors. This produces the results in reverse order. The accumulate() function from the itertools module can be used to get these reversed running maxima, and the zip() function lets you pair them with the numbers in the original list for comparison and selection.
x = [1,2,3,4,50,6,3,2,3,8]
from itertools import accumulate
r = [n for n,m in zip(x[-2::-1],accumulate(x[::-1],max)) if n>m]
print(r)
[50]
This returns a list because, depending on the data, there could be multiple numbers satisfying the condition:
x = [1,2,3,4,50,6,3,2,3,1]
r = [n for n,m in zip(x[-2::-1],accumulate(x[::-1],max)) if n>m][::-1]
print(r)
[50, 6, 3]

Python - efficient way to find first occurrences of multiple values

I have the following problem: for each value in one array, I need to find the first occurrence in another array of a value greater than or equal to it.
Example:
array_1 = [-3,2,8,-1,0,5]
array_2 = [5,1]
The script has to find where in array_1 the first value greater than or equal to each value from array_2 is, so the expected result in this case would be [3,2] with 1-based indices.
A simple loop won't do in my case, as both arrays have close to a million values and it has to execute quickly, preferably in under a minute.
Simple loop solution, which has a run time of about half an hour:
for j in range(0, len(array_2)):
    for i in range(0, len(array_1)):
        if array_1[i] >= array_2[j]:
            solution[j] = i
            break
Edit: indices clarified, as @Sergio Tulentsev correctly pointed out.
First perform some preprocessing on the data: create a new list that only has the values that are greater than all predecessors in the original data, and combine them in a tuple with the 1-based position where they were found.
So for instance, for the example data [-3,2,8,-1,0,5], this would be:
[(-3, 1), (2, 2), (8, 3)]
Note how the answer to any query can only be 1, 2 or 3, as the values at the other positions are all smaller than 8.
Then for each query use a binary search to find the tuple whose left value is at least the queried value, and return the right value of the found tuple (the position). For the binary search you can rely on the bisect library:
import bisect
def solve(data, queries):
    # preprocessing
    maxima = []
    greatest = float("-inf")
    for i, val in enumerate(data):
        if val > greatest:
            greatest = val
            maxima.append((val, i+1))
    # main
    return [maxima[bisect.bisect_left(maxima, (query,))][1]
            for query in queries]
Example use:
data = [-3,2,8,-1,0,5]
queries = [5,1]
print(solve(data, queries)) # [3, 2]
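One case the example does not exercise is a query larger than every value in the data, where the bisect index falls past the end of the maxima list. The variant below is a sketch that returns None for such queries; that behaviour is an assumption, since the question does not say what should happen, and the function name first_at_least is made up for illustration.

```python
import bisect


def first_at_least(data, queries):
    """1-based position of the first value >= each query, or None if there is none."""
    # Keep only the running maxima, with values and positions in parallel lists.
    maxima_values = []
    maxima_positions = []
    for position, value in enumerate(data, start=1):
        if not maxima_values or value > maxima_values[-1]:
            maxima_values.append(value)
            maxima_positions.append(position)
    result = []
    for query in queries:
        # maxima_values is strictly increasing, so binary search applies.
        k = bisect.bisect_left(maxima_values, query)
        result.append(maxima_positions[k] if k < len(maxima_values) else None)
    return result


print(first_at_least([-3, 2, 8, -1, 0, 5], [5, 1, 9]))  # [3, 2, None]
```

Preprocessing is a single pass over the data and each query costs one binary search, so the total work is roughly O(n + m log n) for n data values and m queries.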
I suggest using a loop over the first array and using max(array_2) for the second one.

How can I efficiently merge two sets of ordered pairs, excluding pairs that are lower on both values from any other pair?

As part of a dynamic programming assignment, I find myself having to do the following.
I have two sorted lists of length 2 tuples (ordered pairs, representing scores on two criteria). One pair of values can only be considered strictly greater than another if it is greater on one criterion and not lower on the other. So, (1,8) and (2,7) are incomparable, while (1,7) is lower than (2,8).
Each input list contains only values that are incomparable with each other. My method merges the two lists, omitting duplicates as well as any values that are strictly inferior to another value in the new, bigger list. Then it sorts the new list.
For example, the following input produces this result:
combine([(1,8), (2, 6), (3, 4)], [(2, 7), (3, 3)])
[(1, 8), (2, 7), (3, 4)]
Here's the code I have currently produced:
def combine(left, right):
    # start with lists sorted from biggest to smallest
    newlist = left+right
    leftlen = len(left)
    for i in range(leftlen - 1, -1, -1):
        l = newlist[i]  # candidate value to be inserted in
        for j in range(len(newlist) - 1, leftlen - 1, -1):
            r = newlist[j]
            if r[0] >= l[0]:  # cell with >= food encountered without having previously encountered cell with less water
                if r[1] >= l[1]:  # this cell also has more water - del candidate
                    del newlist[i]
                    leftlen -= 1
                elif r[0] == l[0]:  # equal food and less water than candidate - candidate wins
                    del newlist[j]
                break  # either way, no need to consider further cells -
                       # if this cell doesn't beat candidate, then candidate belongs
            if r[1] <= l[1]:  # cell with less water encountered before cell with more food
                del newlist[j]
                for k in range(j - 1, leftlen - 1, -1):  # continue through right list, deleting cells until a value with
                                                         # higher food is found
                    r = newlist[k]
                    if r[0] > l[0]: break
                    else: del newlist[k]
                break
    newlist.sort(reverse=True)
    return newlist
This code does work, but I am wondering if there is a faster approach to solving this kind of problem? When the lists are long, I end up making a lot of pairwise comparisons.
I've tried to prune out some unnecessary comparisons, relying upon the fact that items in each list are always greater on one criterion and lesser on the other. Thus, the lists are reverse sorted on the first value in the tuple, and therefore also sorted on the second value!
One idea I had was to try and use a different ADT - some type of tree perhaps, but I'm not sure if this is going to help or not.
Any suggestions? This is for an assignment, so I'm looking for ideas rather than for somebody to rewrite the whole thing for me :) Cheers!
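One idea, offered as a rough sketch and not as a finished solution: once all the pairs are sorted from biggest to smallest on the first value, a single sweep that remembers the best second value seen so far is enough to decide which pairs survive. The name pareto_merge is hypothetical, and the sketch assumes "strictly inferior" means exactly what is described above (lower or equal on both values, and not identical).

```python
def pareto_merge(left, right):
    """Merge two lists of pairs, dropping duplicates and dominated pairs."""
    # Sort biggest first on the first value, ties broken by the second value.
    candidates = sorted(set(left) | set(right), reverse=True)
    front = []
    best_second = float("-inf")
    for pair in candidates:
        # Every earlier pair is at least as good on the first value, so this
        # pair survives only if it beats all of them on the second value.
        if pair[1] > best_second:
            front.append(pair)
            best_second = pair[1]
    return front


print(pareto_merge([(1, 8), (2, 6), (3, 4)], [(2, 7), (3, 3)]))
# [(3, 4), (2, 7), (1, 8)]
```

The cost is dominated by the sort. Because both inputs are already ordered, the sort could probably be replaced by a linear two-pointer merge, which is left out here to keep the sketch short.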

Removing points from list if distance between 2 points is below a certain threshold

I have a list of points and I want to keep a point only if its distance from the previously kept point is greater than a certain threshold. So, starting from the first point, if the distance between the first point and the second is less than the threshold, I would remove the second point and then compute the distance between the first point and the third one. If this distance is also less than the threshold, I would compare the first and fourth points. Otherwise I would move on to the distance between the third and fourth points, and so on.
So for example, if the threshold is 2 and I have
list = [1, 2, 5, 6, 10]
then I would expect
new_list = [1, 5, 10]
Thank you!
Not a fancy one-liner, but you can just iterate over the values in the list and append each one to a new list if its distance from the last value in the new list (new[-1]) is at least the threshold:
lst = range(10)
diff = 3
new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff:
        new.append(n)
Afterwards, new is [0, 3, 6, 9].
Concerning your comment "What if i had instead a list of coordinates (x,y)?": In this case you do exactly the same thing, except that instead of just comparing the numbers, you have to find the Euclidean distance between two points. So, assuming lst is a list of (x,y) pairs:
if not new or ((n[0]-new[-1][0])**2 + (n[1]-new[-1][1])**2)**.5 >= diff:
Alternatively, you can convert your (x,y) pairs into complex numbers. For those, basic operations such as addition, subtraction and absolute value are already defined, so you can just use the above code again.
lst = [complex(x,y) for x,y in lst]
new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff:  # same as in the first version
        new.append(n)
print(new)
Now, new is a list of complex numbers representing the points: [0j, (3+3j), (6+6j), (9+9j)]
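For (x, y) pairs, the same loop can also be written with math.dist from the standard library (available since Python 3.8), which avoids both the hand-written formula and the conversion to complex numbers. This is a minimal sketch; the function name thin_points and its parameter names are assumptions made for illustration.

```python
import math


def thin_points(points, min_dist):
    """Keep a point only if it is at least min_dist from the last kept point."""
    kept = []
    for point in points:
        # math.dist returns the Euclidean distance between two points.
        if not kept or math.dist(point, kept[-1]) >= min_dist:
            kept.append(point)
    return kept


points = [(i, i) for i in range(10)]
print(thin_points(points, 3))  # [(0, 0), (3, 3), (6, 6), (9, 9)]
```

Because math.dist accepts points of any matching dimension, the same function also works for (x, y, z) coordinates.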
While the solution by tobias_k works, it is not the most effective one (in my opinion, but I may be overlooking something). It depends on the order of the list and does not take into account how many other elements each element is close to (within the threshold). An element that is close to the largest number of other elements should be eliminated last, and the element with the fewest such proximities should be considered and checked first. The approach I suggest will likely retain more of the points that are farther than the threshold from the other elements in the given list. It works very well for lists of vectors, and therefore for x,y or x,y,z coordinates. If you intend to use it with a list of scalars, you can simply add this line to the code: orig_list=np.array(orig_list)[:,np.newaxis].tolist()
Please see the solution below:
import numpy as np
thresh = 2.0
orig_list=[[1,2], [5,6], ...]
nsamp = len(orig_list)
arr_matrix = np.array(orig_list)
# pairwise distances; the np.float alias was removed from NumPy, so use the built-in float
distance_matrix = np.zeros([nsamp, nsamp], dtype=float)
for ii in range(nsamp):
    distance_matrix[:, ii] = np.apply_along_axis(lambda x: np.linalg.norm(np.array(x)-np.array(arr_matrix[ii, :])),
                                                 1,
                                                 arr_matrix)
# for every point, count how many points lie within the threshold
n_proxim = np.apply_along_axis(lambda x: np.count_nonzero(x < thresh),
                               0,
                               distance_matrix)
# visit the points with the fewest close neighbours first
idx = np.argsort(n_proxim).tolist()
idx_out = list()
for ii in idx:
    for jj in range(ii+1):
        if ii not in idx_out:
            if distance_matrix[ii, jj] < thresh:
                if ii != jj:
                    idx_out.append(jj)
# remove the rejected points, highest index first so earlier indices stay valid
pop_idx = sorted(np.unique(idx_out).tolist(),
                 reverse=True)
for pop_id in pop_idx:
    orig_list.pop(pop_id)
nsamp = len(orig_list)

Select items around a value in a sorted list with multiple repeated values

I'm trying to select some elements in a python list. The list represents a distribution of the sizes of some other elements, so it contains multiple repeated values.
After I find the average value of this list, I want to pick the elements whose value lies between a lower bound and an upper bound around that average. I can do that easily, but it selects too many elements (mainly because the distribution I have to work with is pretty much homogeneous). So I would like to be able to set the bounds within which values are chosen, but also limit the spread of the search to something like 5 elements below the average and 5 elements above.
I'll add my code (it is super simple).
avg_lists = sum_lists/len(lists)
num_list = len(lists)
if (int(num_list/10)%2 == 0):
    window_size = int(num_list/10)
else:
    window_size = int(num_list/10)-1
out_file = open('chosenLists', 'w+')
chosen_lists = []
for lst in lists:
    if len(lst) >= (avg_lists-window_size) and len(lst) <= (avg_lists+window_size):
        chosen_lists.append(lst)
        out_file.write("%s\n" % lst)
If you are allowed to use median instead of average then you can use this simple solution:
def select(l, n):
assert n <= len(l)
s = sorted(l) # sort the list
i = (len(s) - n) // 2
return s[i:i+n] # return sublist of n elements from the middle
print(select([1,2,3,4,5,1,2,3,4,5], 5))  # shows [2, 2, 3, 3, 4]
The function select returns n elements closest to the median.
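If the average has to be kept, a similar sorted-list approach can combine both limits from the question: a value window around the mean and a cap on how many elements are taken from each side. The sketch below is only one possible reading of the requirement. The function name select_around_mean and the parameters k (elements per side) and spread (half-width of the value window) are hypothetical.

```python
import bisect
from statistics import mean


def select_around_mean(values, k, spread):
    """Pick at most k values on each side of the mean, within mean +/- spread."""
    ordered = sorted(values)
    avg = mean(ordered)
    # Index of the first value that is >= the mean.
    split = bisect.bisect_left(ordered, avg)
    # Up to k values just below the mean, keeping only those inside the window.
    below = [v for v in ordered[max(0, split - k):split] if v >= avg - spread]
    # Up to k values at or above the mean, keeping only those inside the window.
    above = [v for v in ordered[split:split + k] if v <= avg + spread]
    return below + above


print(select_around_mean([1, 2, 3, 4, 5, 1, 2, 3, 4, 5], 2, 1))  # [2, 2, 3, 3]
```

Values equal to the mean count toward the upper side here, which is an arbitrary choice; counting them separately would be an equally reasonable variation.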
