Top n closest numbers from a python list


I often need to select a certain number of values from a list, so that they are the ones closest to some given number.
For example:
x0 = 45
n = 3
mylist = [12,32,432,43,54,234,23,543,2]
So, how do I select n numbers from the list which are the closest ones to x0? Is there some built-in method?
topN = [43, 54, 32]
The approach I came up with is below, but it looks a bit convoluted:
diffs = sorted([(abs(x - x0), x) for x in mylist])
topN = [d[1] for d in diffs[:n]]

Use heapq.nsmallest:
heapq.nsmallest(n, iterable[, key])
Return a list with the n smallest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in the iterable (for example, key=str.lower). Equivalent to: sorted(iterable, key=key)[:n]
So in your particular case:
import heapq
x0 = 45
n = 3
mylist = [12,32,432,43,54,234,23,543,2]
heapq.nsmallest(n, mylist, key=lambda x: abs(x-x0))
This has less overhead than sorting the whole list, because it keeps only the n best candidates seen so far and discards the rest.

This can alternatively be done by sorting with a custom key function:
sorted(mylist, key=lambda x: abs(x - x0))[:n]
Sorting a list of N items is O(N log N), whereas heapq.nsmallest is roughly O(N log n), so sorting has the worse time complexity. However, it has less overhead and is therefore often faster for small lists.
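As a rough check that the two answers agree, here is a minimal sketch that wraps each approach in a function. The names closest_heap and closest_sorted are hypothetical, chosen only for this illustration; the data is the list from the question.

```python
import heapq

def closest_heap(values, target, n):
    """Return the n values nearest to target using heapq.nsmallest."""
    return heapq.nsmallest(n, values, key=lambda x: abs(x - target))

def closest_sorted(values, target, n):
    """Return the n values nearest to target by sorting everything first."""
    return sorted(values, key=lambda x: abs(x - target))[:n]

mylist = [12, 32, 432, 43, 54, 234, 23, 543, 2]
print(closest_heap(mylist, 45, 3))    # [43, 54, 32]
print(closest_sorted(mylist, 45, 3))  # [43, 54, 32]
```

Both return the results ordered from nearest to farthest, so either can be swapped for the other depending on list size.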

Related

List comprehension with lambda

How can I rewrite this loop using lambda and list-comprehension?
from math import factorial
n = 17
e = 0
for i in range(0, n):
    e += 1 / factorial(i)
print(e)
So far I have tried this, but it's not working:
lst = [item for item in map(lambda e: e + 1/factorial(i), range(0,n))])
myMap = map(lambda e: e + 1/factorial(i), range(0,n))

A list comprehension with a lambda does not fit well in this case, so here are some functional-programming approaches that will hopefully make the difference from a comprehension expression clear:
A reduce approach: the list is contracted to a single number via a functional:
from math import factorial
from functools import reduce
n = 10
reduce(lambda i, j: i + 1/factorial(j), range(n), 0)
#2.7182815255731922
Here with map: consecutive one-to-one mappings in which a function is applied termwise. Finally, the sum functional contracts the list into a single number.
Note: it should be read from right to left: apply the factorial, then take the reciprocal (int(-1).__rpow__(x) computes x ** -1), then sum.
sum(map(int(-1).__rpow__, map(factorial, range(n))))
#2.7182815255731922

I would use the sum function for this. I also don't understand why you are using the map function together with a list comprehension.
e = sum([1 / factorial(i) for i in range(n)])
As you can see, a list comprehension can also transform each element as it builds the list. You don't need anything complex with map and lambda.
For completeness, the sum function returns the sum of all values in a list. For example:
>>> l = [1, 2, 3, 4, 5, 6]
>>>
>>> sum(l)
21
When your n value gets larger and you need to start paying attention to efficiency, you can use the following code:
sum(1 / factorial(i) for i in range(n))
I removed the square brackets here. This means that Python creates a generator instead of a list. A generator does not precompute all its values; it calculates each one only when it is needed. This is more memory-efficient than a list, which stores every value before sum iterates over them.
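To tie the answers together, here is a small sketch of the generator version wrapped in a function and compared against math.e. The function name approx_e is an assumption made for this example.

```python
import math
from math import factorial

def approx_e(n):
    """Sum the first n terms of the series 1/0! + 1/1! + 1/2! + ..."""
    return sum(1 / factorial(i) for i in range(n))

print(approx_e(17))
# With 17 terms the result agrees with math.e to well within 1e-12.
print(abs(approx_e(17) - math.e) < 1e-12)  # True
```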

Check if any keys in a dictionary match any values

I'm trying to solve this problem:
Let d(n) be defined as the sum of proper divisors of n (numbers less than n which divide evenly into n).
If d(a) = b and d(b) = a, where a ≠ b, then a and b are an amicable pair and each of a and b are called amicable numbers.
For example, the proper divisors of 220 are 1, 2, 4, 5, 10, 11, 20, 22, 44, 55 and 110; therefore d(220) = 284. The proper divisors of 284 are 1, 2, 4, 71 and 142; so d(284) = 220.
Evaluate the sum of all the amicable numbers under 10000.
I came up with a dictionary that holds x:d(x) for all numbers 0 to 9999 like so:
sums = {x:sum(alecproduct.find_factors(x))-x for x,y in enumerate(range(10**4))}
where alecproduct.find_factors is a function from my own module that returns a list of all the factors of a number.
I'm not sure where to go from here, though. I've tried iterating over the dictionary and creating tuples out of each k-v pair like so:
for k, v in sums.items():
    dict_tups.append((k, v))
But I don't think this helps me. Any advice on how I can detect if any of the dictionary keys match any of the dictionary values?
Edit - My solution based on 6502's answer:
sums,ap = {x:sum(find_factors(x))-x for x,y in enumerate(range(10**4))}, []
for x in sums:
    y = sums[x]
    if sums.get(y) == x and x != y:
        ap.append(x)
print(ap)
print('\nSum: ', sum(ap))

Your problem is almost solved already... just get all couples out:
for x in my_dict:
    y = my_dict[x]
    if my_dict.get(y) == x:
        # x/y is an amicable pair
        ...
Note that every pair will be extracted twice (both x/y and y/x) and perfect numbers (numbers that equal the sum of their proper divisors) only once; I'm not sure from your problem text whether 6/6 is considered an amicable pair or not.
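For readers without the asker's find_factors module, here is a self-contained sketch of the same lookup idea. It swaps in a simple sieve to compute the sums of proper divisors (an assumption of this example, not the asker's method), and it requires x < y so each pair is reported once and perfect numbers are excluded. The function names are hypothetical.

```python
def proper_divisor_sums(limit):
    """Return a list d where d[n] is the sum of proper divisors of n, for n < limit."""
    d = [0] * limit
    for i in range(1, limit):
        # i is a proper divisor of every multiple of i that is larger than i
        for multiple in range(2 * i, limit, i):
            d[multiple] += i
    return d

def amicable_pairs(limit):
    """Return each amicable pair (x, y) with x < y < limit exactly once."""
    d = proper_divisor_sums(limit)
    return [(x, d[x]) for x in range(2, limit)
            if x < d[x] < limit and d[d[x]] == x]

pairs = amicable_pairs(10_000)
print(pairs)                         # the first pair is (220, 284), as in the question
print(sum(x + y for x, y in pairs))  # sum of all amicable numbers below the limit
```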

This code should give you a list of all keys that are also values.
my_test = [key for key in my_dict.keys() if key in my_dict.values()]
You don't need .keys(), because iterating over a dictionary yields its keys by default; I included it to be explicit in this example.
Alternatively, a for loop can find the keys that are equal to their own value:
for key, value in my_dict.items():
    if key == value:
        print(key)  # or do stuff
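One caveat with the comprehension above: key in my_dict.values() scans every value on each test. A minimal sketch that builds a set of the values once, assuming the values are hashable (the sample dictionary is made up for illustration):

```python
my_dict = {1: 2, 2: 1, 3: 7, 4: 4}

# Build the set of values once, so each membership test is a hash lookup
# instead of a scan over every value.
values = set(my_dict.values())
keys_that_are_values = [key for key in my_dict if key in values]
print(keys_that_are_values)  # [1, 2, 4]
```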

Iterating through the keys and values of your sums dictionary to create a new list with all the amicable numbers solves the problem. Here is the code snippet:
amicable_list = []
for i in sums.keys():
    if i in sums.values():
        if (sums.get(sums.get(i, -1), -1) == i) and (i != sums[i]):
            amicable_list.append(i)

You could use sets:
x_dx = {(x, sum(alecproduct.find_factors(x)) - x) for x in range(10 ** 4)}
x_dx = {t for t in x_dx if t[0] != t[1]}
dx_x = {(t[1], t[0]) for t in x_dx}
amicable_pairs = x_dx & dx_x
As in 6502's answer, all amicable pairs are extracted twice.
A way to remove these 'duplicates' could be (although it's certainly a mouthful):
amicable_pairs_sorted = {tuple(sorted(t)) for t in amicable_pairs}
amicable_pairs_ascending = sorted(list(amicable_pairs_sorted))

Check number not a sum of 2 ints on a list

Given a list of integers, I want to check a second list and remove from the first only those which can not be made from the sum of two numbers from the second. So given a = [3,19,20] and b = [1,2,17], I'd want [3,19].
Seems like a cinch with nested loops, except that I've gotten stuck with the break and continue commands.
Here's what I have:
def myFunction(list_a, list_b):
    for i in list_a:
        for a in list_b:
            for b in list_b:
                if a + b == i:
                    break
            else:
                continue
            break
        else:
            continue
        list_a.remove(i)
    return list_a
I know what I need to do, just the syntax seems unnecessarily confusing. Can someone show me an easier way? TIA!

You can do it like this:
In [13]: from itertools import combinations
In [15]: [item for item in a if item in [sum(i) for i in combinations(b,2)]]
Out[15]: [3, 19]
combinations(b, 2) yields every pair of elements from b. The inner list holds the sum of each pair, and the outer comprehension keeps the items of a that appear among those sums.
Edit
If you don't want to use itertools, you can write a function for it, like this:
def comb(s):
    for i, v1 in enumerate(s):
        for j in range(i+1, len(s)):
            yield [v1, s[j]]
result = [item for item in a if item in [sum(i) for i in comb(b)]]
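Note that the comprehension above rebuilds the list of sums for every item of a. A minimal sketch that computes the pair sums once and stores them in a set; the function name keep_pair_sums is hypothetical:

```python
from itertools import combinations

def keep_pair_sums(a, b):
    """Keep the items of a that equal the sum of two different elements of b."""
    pair_sums = {x + y for x, y in combinations(b, 2)}  # computed once
    return [item for item in a if item in pair_sums]

print(keep_pair_sums([3, 19, 20], [1, 2, 17]))  # [3, 19]
```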

Comments on code:
It's very dangerous to delete elements from a list while iterating over it. Perhaps you could append items you want to keep to a new list, and return that.
Your current algorithm is O(nm^2), where n is the size of list_a, and m is the size of list_b. This is pretty inefficient, but a good start to the problem.
There's also a lot of unnecessary continue and break statements, which can lead to complicated code that is hard to debug.
You also put everything into one function. You could split each task into its own function, such as one function for finding pairs and one for checking each item in list_a against list_b. This is a way of splitting a problem into smaller problems and using them to solve the bigger one.
Overall I think your function is doing too much, and the logic could be condensed into much simpler code by breaking down the problem.
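To illustrate the first comment, here is a small sketch of what can go wrong when removing from a list during iteration, next to the safer build-a-new-list version. Both function names are made up for this example, and the first one is buggy on purpose.

```python
def remove_twos_in_place(values):
    """Buggy on purpose: mutates the list it is iterating over."""
    for x in values:
        if x == 2:
            values.remove(x)
    return values

def remove_twos_copy(values):
    """Build a new list instead of mutating the input."""
    return [x for x in values if x != 2]

print(remove_twos_in_place([1, 2, 2, 3]))  # [1, 2, 3]: one 2 was skipped
print(remove_twos_copy([1, 2, 2, 3]))      # [1, 3]
```

After the first removal the remaining items shift left, so the iterator steps over the second 2 without ever testing it.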
Another approach:
Since I found this task interesting, I decided to try it myself. My outlined approach is illustrated below.
1. You can first check if a list has a pair of a given sum in O(n) time using hashing:
def check_pairs(lst, sums):
    lookup = set()
    for x in lst:
        current = sums - x
        if current in lookup:
            return True
        lookup.add(x)
    return False
2. Then you can use this function to check whether any pair in list_b sums to each number in list_a:
def remove_first_sum(list_a, list_b):
    new_list_a = []
    for x in list_a:
        check = check_pairs(list_b, x)
        if check:
            new_list_a.append(x)
    return new_list_a
This keeps the numbers in list_a that are the sum of two numbers in list_b.
3. The above can also be written with a list comprehension:
def remove_first_sum(list_a, list_b):
    return [x for x in list_a if check_pairs(list_b, x)]
Both work as follows:
>>> remove_first_sum([3,19,20], [1,2,17])
[3, 19]
>>> remove_first_sum([3,19,20,18], [1,2,17])
[3, 19, 18]
>>> remove_first_sum([1,2,5,6],[2,3,4])
[5, 6]
Note: check_pairs runs in O(m) time for a list_b of m items, so filtering a list_a of n items takes O(n * m) time overall, which improves on the O(n * m^2) nested loops. It also needs O(m) auxiliary space, because a set is kept to record which items have been seen.

You can do it by first creating all possible sum combinations, then filtering out elements which don't belong to that combination list
Define the input lists
>>> a = [3,19,20]
>>> b = [1,2,17]
Next we will define all possible combinations of sum of two elements
>>> y = [i+j for k,j in enumerate(b) for i in b[k+1:]]
Next we apply a function to every element of list a and check whether it is present in the list calculated above. The lambda passed to map uses an if/else expression that returns None when the element is not in y. We then filter the result to remove the None values (note that filter(None, ...) also drops any other falsy value, such as 0):
>>> list(filter(None, map(lambda x: x if x in y else None,a)))
The above operation will output:
[3, 19]
You can also write a one-liner by combining all these lines into one, but I don't recommend this.

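You can try something like this (note that it also pairs each element of b with itself, and that final_data holds the numbers of a that are not a sum of two elements of b):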
a = [3,19,20]
b= [1,2,17,5]
n_m_s=[]
data=[n_m_s.append(i+j) for i in b for j in b if i+j in a]
print(set(n_m_s))
print("after remove")
final_data=[]
for j, i in enumerate(a):
    if i not in n_m_s:
        final_data.append(i)
print(final_data)
output:
{19, 3}
after remove
[20]

Sort list of lists by unique reversed absolute condition

Context - developing algorithm to determine loop flows in a power flow network.
Issue:
I have a list of lists, each list represents a loop within the network determined via my algorithm. Unfortunately, the algorithm will also pick up the reversed duplicates.
i.e.
L1 = [a, b, c, -d, -a]
L2 = [a, d, c, -b, -a]
(Please note that c should not be negative, it is correct as written due to the structure of the network and defined flows)
Now these two loops are equivalent, simply following the reverse structure throughout the network.
I wish to retain L1, whilst discarding L2 from the list of lists.
Thus, if I have a list of 6 loops of which 3 are reversed duplicates, I wish to retain only the 3 unique loops.
Additionally, a loop does not have to follow the format specified above. It can be shorter or longer, and the sign structure (e.g. pos pos pos neg neg) will not occur in all instances.
I have been attempting to sort this by reversing the list and comparing the absolute values.
I am completely stumped and any assistance would be appreciated.
Based upon some of the code provided by mgibson, I was able to create the following:
def Check_Dup(Loops):
    Act = []
    while Loops:
        L = Loops.pop()
        Act.append(L)
        Loops = Popper(Loops, L)
    return Act
def Popper(Loops, L):
    # iterate over a copy, because matching loops are removed from Loops
    for loop in Loops[:]:
        Rev = loop[::-1]
        if all(abs(x) == abs(y) for x, y in zip(L, Rev)):
            Loops.remove(loop)
    return Loops
This code should run until there are no loops left, discarding the duplicates each time. I'm accepting mgibson's answer as it provided the necessary keys to create the solution.

I'm not sure I get your question, but reversing a list is easy:
a = [1,2]
a_rev = a[::-1] #new list -- if you just want an iterator, reversed(a) also works.
To compare the absolute values of a and a_rev:
all( abs(x) == abs(y) for x,y in zip(a,a_rev) )
which can be simplified to:
all( abs(x) == abs(y) for x,y in zip(a,reversed(a)) )
Now, in order to make this as efficient as possible, I would first sort the lists based on their absolute values:
your_list_of_lists.sort(key=lambda x: list(map(abs, x)))
Now you know that if two lists are going to be equal, they have to be adjacent in the list and you can just pull that out using enumerate:
def cmp_list(x, y):
    return True if x == y else all(abs(a) == abs(b) for a, b in zip(x, y))
duplicate_idx = [idx for idx, val in enumerate(your_list_of_lists[1:])
                 if cmp_list(val, your_list_of_lists[idx])]
# now remove duplicates:
for idx in reversed(duplicate_idx):
    _ = your_list_of_lists.pop(idx)
If your (sub) lists are either strictly increasing or strictly decreasing, this becomes MUCH simpler.
lists = list(set( tuple(sorted(x)) for x in your_list_of_lists ) )

I don't see how they can be equivalent if you have c in both directions - one of them must be -c
>>> a,b,c,d = range(1,5)
>>> L1 = [a, b, c, -d, -a]
>>> L2 = [a, d, -c, -b, -a]
>>> L1 == [-x for x in reversed(L2)]
True
Now you can write a function to collapse those two loops into a single value:
>>> def normalise(loop):
...     return min(loop, [-x for x in reversed(loop)])
...
>>> normalise(L1)
[1, 2, 3, -4, -1]
>>> normalise(L2)
[1, 2, 3, -4, -1]
A good way to eliminate duplicates is to use a set, we just need to convert the lists to tuples
>>> L=[L1, L2]
>>> set(tuple(normalise(loop)) for loop in L)
set([(1, 2, 3, -4, -1)])
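Since a set does not preserve the order of the loops, here is a sketch of the same normalisation idea that keeps the first loop seen from each equivalent pair. It assumes the sign convention used in this answer (the reversed loop has every sign flipped); the helper unique_loops is a hypothetical name.

```python
def normalise(loop):
    """Return the smaller of a loop and its negated reversal, as a tuple."""
    forward = tuple(loop)
    backward = tuple(-x for x in reversed(loop))
    return min(forward, backward)

def unique_loops(loops):
    """Drop reversed duplicates, keeping the first loop seen from each pair."""
    seen = set()
    kept = []
    for loop in loops:
        key = normalise(loop)
        if key not in seen:
            seen.add(key)
            kept.append(loop)
    return kept

a, b, c, d = range(1, 5)
L1 = [a, b, c, -d, -a]
L2 = [a, d, -c, -b, -a]
print(unique_loops([L1, L2, [a, b]]))  # [[1, 2, 3, -4, -1], [1, 2]]
```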

[pair[0] for pair in frozenset(tuple(sorted((c, negReversed(c)))) for c in cycles)]
Where:
def negReversed(cycle):
    return tuple(-x for x in cycle[::-1])
and where cycles must be tuples.
This takes each cycle, computes its reversed duplicate, and sorts the two so that equivalent cycles produce the same pair. The frozenset(...) removes any duplicate pairs. Then you extract the canonical element (in this case I arbitrarily chose it to be pair[0]).
Keep in mind that your algorithm might be returning cycles starting in arbitrary places. If this is the case (i.e. your algorithm might return either [1,2,-3] or [-3,1,2]), then you need to consider these as equivalent necklaces.
There are many ways to canonicalize necklaces. The way above is less efficient because we don't canonicalize the necklace directly: we just treat the entire equivalence class as the canonical element, by turning each cycle (a,b,c,d,e) into {(a,b,c,d,e), (e,a,b,c,d), (d,e,a,b,c), (c,d,e,a,b), (b,c,d,e,a)}. In your case, since you consider negatives to be equivalent, you would turn each cycle into {(a,b,c,d,e), (e,a,b,c,d), (d,e,a,b,c), (c,d,e,a,b), (b,c,d,e,a), (-a,-b,-c,-d,-e), (-e,-a,-b,-c,-d), (-d,-e,-a,-b,-c), (-c,-d,-e,-a,-b), (-b,-c,-d,-e,-a)}. Make sure to use frozenset for each class, as a set is not hashable and so cannot be an element of another set:
[next(iter(cls)) for cls in {frozenset(eqClass(c)) for c in cycles}]
where:
def eqClass(cycle):
    for rotation in rotations(cycle):
        yield rotation
        yield tuple(-x for x in rotation)
where rotations is something like Efficient way to shift a list in python, but yields tuples.
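The rotations helper is not defined in this answer, so here is one possible sketch of it together with the equivalence-class idea. All names are hypothetical, and picking min of each class as the representative is a choice made for this example so that the result is deterministic.

```python
def rotations(cycle):
    """Yield every rotation of cycle as a tuple (hypothetical helper)."""
    cycle = tuple(cycle)
    for i in range(len(cycle)):
        yield cycle[i:] + cycle[:i]

def eq_class(cycle):
    """Yield each rotation of cycle together with its elementwise negation."""
    for rotation in rotations(cycle):
        yield rotation
        yield tuple(-x for x in rotation)

def unique_cycles(cycles):
    """Return one representative (the smallest tuple) per equivalence class."""
    classes = {frozenset(eq_class(c)) for c in cycles}
    return sorted(min(cls) for cls in classes)

# (1, 2, -3) and (-3, 1, 2) are rotations of each other, so they collapse.
print(unique_cycles([(1, 2, -3), (-3, 1, 2), (4, 5)]))  # [(-5, -4), (-3, 1, 2)]
```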

Max Value within a List of Lists of Tuple

I have a problem getting the highest value in a dynamic list of lists of tuples.
The list can look like this:
adymlist = [[('name1',1)],[('name2',2),('name3',1), ...('name10', 20)], ...,[('name m',int),..]]
Now I loop through the List to get the highest Value (integer):
total = {}
y = 0
while y < len(adymlist):
    if len(adymlist) == 1:
        # the list has only 1 element -> save it in total
        total[adymlist[y][0][0]] = adymlist[y][0][1]
        y += 1
    else:
        # here is the problem
        # iterate through each list to get the highest value
        # and you don't know how long this list can be
        # save the highest value in total, e.g. total = {'name1':1,'name10':20, ..}
I have tried a lot to get the maximum value but have not found a solution. I know I must loop through each tuple in the list and compare it with the next one, but I don't know how to code it correctly.
I could also use the max() function, but it doesn't work with mixed strings and integers. For example:
a = [('a', 5), ('z', 1)] -> max(a) returns ('z', 1). Obviously 5 > 1, but 'z' > 'a'. So I tried to extend the max function with max(a, key=int), but I get a TypeError.
I hope you can understand what I want ;-)
UPDATE
Thanks so far.
If I use itertools.chain(*adymlist) and max(flatlist, key=lambda x: x[1]), I get an exception like:
max_word = max(flatlist, key=lambda x: x[1])
TypeError: 'int' object is unsubscriptable
But if I use itertools.chain(adymlist), it works fine. However, I don't know how to sum all the integers from each tuple in the list. I need your help to figure it out.
Otherwise, I wrote a workaround for itertools.chain(*adymlist) to get the sum of all integers and the highest integer in that list:
chain = itertools.chain(*adymlist)
flatlist = list(chain)
# flatlist = string, integer, string, integer, ...
max_count = max(flatlist[1:len(flatlist):2])
total_count = sum(flatlist[1:len(flatlist):2])
# index of highest integer
idx = flatlist.index(next((n for n in flatlist if n == max_count)))
max_keyword = flatlist[idx-1]
It still does what I want, but isn't it too dirty?

To clarify, it looks like you've got a list of lists of tuples. It doesn't look like we care which list they are in, so we can simplify this to two steps:
1. Flatten the list of lists to a list of tuples
2. Find the max value
The first part can be accomplished via itertools.chain (see e.g., Flattening a shallow list in Python).
The second can be solved with max. You have the right idea, but you should pass in a function rather than the type you want. This function needs to return the value you are keying on, in this case the second element of the tuple:
max(flatlist, key=lambda x: x[1])
Correction
I re-read your question: are you looking for the max value in each sub-list? If so, then only the second part is applicable. Simply find the max of each sub-list as you iterate over the outer list.
A bit more Pythonic than what you currently have would look like:
output = []
for lst in lists:
    output.append(max(lst, key=lambda x: x[1]))
or
map(lambda x: max(x, key=lambda y: y[1]), lists)
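Putting the pieces together, here is a minimal sketch covering the overall maximum, the per-sub-list maximum, and the total the asker mentions. The sample data is an assumption modelled on the shape shown in the question, and count_of is a hypothetical helper name.

```python
from itertools import chain

adymlist = [[('name1', 1)], [('name2', 2), ('name3', 1), ('name10', 20)]]

def count_of(pair):
    """Return the integer part of a (name, count) tuple."""
    return pair[1]

# Highest tuple across all sub-lists, after flattening one level.
flat = list(chain.from_iterable(adymlist))
print(max(flat, key=count_of))                       # ('name10', 20)

# Highest tuple within each sub-list.
print([max(sub, key=count_of) for sub in adymlist])  # [('name1', 1), ('name10', 20)]

# Sum of all the integers.
print(sum(count for _, count in flat))               # 24
```

Flattening only one level keeps each (name, count) tuple intact, so the key function can still index into it.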

As spintheblack says, you have a list of lists of tuples. I presume you are looking for the highest integer value of all tuples.
You can iterate over the outer list, then over the list of tuples, like this:
max_so_far = 0
for lst in adymlist:
    for t in lst:
        if t[1] > max_so_far:
            max_so_far = t[1]
print(max_so_far)
This is a little bit more verbose but might be easier to understand.
