Check if any keys in a dictionary match any values - python

I'm trying to solve this problem:
Let d(n) be defined as the sum of proper divisors of n (numbers less than n which divide evenly into n).
If d(a) = b and d(b) = a, where a ≠ b, then a and b are an amicable pair and each of a and b are called amicable numbers.
For example, the proper divisors of 220 are 1, 2, 4, 5, 10, 11, 20, 22, 44, 55 and 110; therefore d(220) = 284. The proper divisors of 284 are 1, 2, 4, 71 and 142; so d(284) = 220.
Evaluate the sum of all the amicable numbers under 10000.
I came up with a dictionary that holds x:d(x) for all numbers 0 to 9999 like so:
sums = {x:sum(alecproduct.find_factors(x))-x for x,y in enumerate(range(10**4))}
Where alecproduct.findfactors is a function from my own module that returns a list of all the factors of a number
I'm not sure where to go from here, though. I've tried iterating over the dictionary and creating tuples out of each k-v pair like so:
for k,v in sums.items():
dict_tups.append((k,v))
But I don't think this helps me. Any advice on how I can detect if any of the dictionary keys match any of the dictionary values?
Edit - My solution based on 6502's answer:
sums,ap = {x:sum(find_factors(x))-x for x,y in enumerate(range(10**4))}, []
for x in sums:
y = sums[x]
if sums.get(y) == x and x != y:
ap.append(x)
print(ap)
print('\nSum: ', sum(ap))

Your problem is almost solved already... just get all couples out:
for x in my_dict:
y = my_dict[x]
if my_dict.get(y) == x:
# x/y is an amicable pair
...
note that every pair will be extracted twice (both x/y and y/x) and perfect numbers (numbers that are the sum of their divisors) only once; not sure from your problem text if 6/6 is considered an amicable pair or not.

This code should give you a list of all keys that are also values.
my_test = [key for key in my_dict.keys() if key in my_dict.values()]
You don't need .keys() because, this is the default behavior however, I wanted to be explicit for this example.
Alternatively, a for loop example can be seen below.
for key, value in my_dict.iteritems():
if key == value:
print key # or do stuff

iterating through keys and values of your sums dictionary to create a new list with all the amicable numbers solves the problem, here is the code snippet.
amicable_list=[]
for i in sums.keys():
if i in sums.values():
if (sums.get(sums.get(i,-1),-1) == i) and (i != sums[i]):
amicable_list.append(i)

You could use sets:
x_dx = {(x, sum(alecproduct.find_factors(x)) - x) for x in range(10 ** 4)}
x_dx = {t for t in x_dx if t[0] != t[1]}
dx_x = {(t[1], t[0]) for t in x_dx}
amicable_pairs = x_dx & dx_x
As in 6502's answer, all amicable pairs are extracted twice.
A way to remove these 'duplicates' could be (although it's certainly a mouthful):
amicable_pairs_sorted = {tuple(sorted(t)) for t in amicable_pairs}
amicable_pairs_ascending = sorted(list(amicable_pairs_sorted))

Related

Efficiently combine two uneven lists into dictionary based on condition

I have two lists of tuples and I want to map elements in e to elements in s based on a condition. The condition is that the 1st element of something in e needs to be >= to the 1st element in s and elements 1+2 in e need to be <= to elements 1+2 in s. The 1st number in each s tuple is a start position and the second is the length. I can do it as follows:
e = [('e1',3,3),('e2',6,2),('e3',330,3)]
s = [('s1',0,10),('s2',11,24),('s3',35,35),('s4',320,29)]
for i in e:
d = {i:j for j in s if i[1]>=j[1] and i[1]+i[2]<=j[1]+j[2]}
print(d)
Output (what I want):
{('e1', 3, 3): ('s1', 0, 10)}
{('e2', 6, 2): ('s1', 0, 10)}
{('e3', 330, 3): ('s4', 320, 29)}
Is there a more efficient way to get to this result, ideally without the for loop (at least not a double loop)? I've have tried some things with zip as well as something along the lines of
list(map(lambda i,j: {i:j} if i[1]>=j[1] and i[1]+i[2]<=j[1]+j[2] else None, e, s))
but it is not giving me quite what I am looking for.
The elements in s will never overlap. For example, you wouldn't have ('s1',0,10) and ('s2', 5,15). In other words, the range (0, 0+10) will never overlap with (5,5+15) or that of any other tuple. Additionally, all the tuples in e will be unique.
The constraint that the tuples in s can't overlap is pretty strong. In particular, it implies that each value in e can only match with at most one value in s (I think the easiest way to show that is to assume two distinct, non-overlapping tuples in s match with a single element in e and derive a contradiction).
Because of that property, any s-tuple s1 matching an e-tuple e1 has the property that among all tuples sx in s with sx[1]<=e1[1], it is the one with the greatest sum(sx[1:]), since if it weren't then the s-tuple with a small enough 1st coordinate and greater sum would also be a match (which we already know is impossible).
That observation lends itself to a fairly simple algorithm where we linearly walk through both e and s (sorted), keeping track of the s-tuple with the biggest sum. If that sum is big enough compared to the e-tuple we're looking at, add the pair to our result set.
def pairs(e, s):
# Might be able to skip or replace with .sort() depending
# on the rest of your program
e = sorted(e, key=lambda t: t[1])
s = sorted(s, key=lambda t: t[1])
i, max_seen, max_sum, results = 0, None, None, {}
for ex in e:
while i < len(s) and (sx:=s[i])[1] <= ex[1]:
if not max_seen or sum(sx[1:]) > max_sum:
max_seen, max_sum = sx, sum(sx[1:])
i += 1
if max_seen and max_sum > sum(ex[1:]):
results[ex] = s[i]
return results

Indexing a list with an unique index

I have a list say l = [10,10,20,15,10,20]. I want to assign each unique value a certain "index" to get [1,1,2,3,1,2].
This is my code:
a = list(set(l))
res = [a.index(x) for x in l]
Which turns out to be very slow.
l has 1M elements, and 100K unique elements. I have also tried map with lambda and sorting, which did not help. What is the ideal way to do this?
You can do this in O(N) time using a defaultdict and a list comprehension:
>>> from itertools import count
>>> from collections import defaultdict
>>> lst = [10, 10, 20, 15, 10, 20]
>>> d = defaultdict(count(1).next)
>>> [d[k] for k in lst]
[1, 1, 2, 3, 1, 2]
In Python 3 use __next__ instead of next.
If you're wondering how it works?
The default_factory(i.e count(1).next in this case) passed to defaultdict is called only when Python encounters a missing key, so for 10 the value is going to be 1, then for the next ten it is not a missing key anymore hence the previously calculated 1 is used, now 20 is again a missing key and Python will call the default_factory again to get its value and so on.
d at the end will look like this:
>>> d
defaultdict(<method-wrapper 'next' of itertools.count object at 0x1057c83b0>,
{10: 1, 20: 2, 15: 3})
The slowness of your code arises because a.index(x) performs a linear search and you perform that linear search for each of the elements in l. So for each of the 1M items you perform (up to) 100K comparisons.
The fastest way to transform one value to another is looking it up in a map. You'll need to create the map and fill in the relationship between the original values and the values you want. Then retrieve the value from the map when you encounter another of the same value in your list.
Here is an example that makes a single pass through l. There may be room for further optimization to eliminate the need to repeatedly reallocate res when appending to it.
res = []
conversion = {}
i = 0
for x in l:
if x not in conversion:
value = conversion[x] = i
i += 1
else:
value = conversion[x]
res.append(value)
Well I guess it depends on if you want it to return the indexes in that specific order or not. If you want the example to return:
[1,1,2,3,1,2]
then you can look at the other answers submitted. However if you only care about getting a unique index for each unique number then I have a fast solution for you
import numpy as np
l = [10,10,20,15,10,20]
a = np.array(l)
x,y = np.unique(a,return_inverse = True)
and for this example the output of y is:
y = [0,0,2,1,0,2]
I tested this for 1,000,000 entries and it was done essentially immediately.
Your solution is slow because its complexity is O(nm) with m being the number of unique elements in l: a.index() is O(m) and you call it for every element in l.
To make it O(n), get rid of index() and store indexes in a dictionary:
>>> idx, indexes = 1, {}
>>> for x in l:
... if x not in indexes:
... indexes[x] = idx
... idx += 1
...
>>> [indexes[x] for x in l]
[1, 1, 2, 3, 1, 2]
If l contains only integers in a known range, you could also store indexes in a list instead of a dictionary for faster lookups.
You can use collections.OrderedDict() in order to preserve the unique items in order and, loop over the enumerate of this ordered unique items in order to get a dict of items and those indices (based on their order) then pass this dictionary with the main list to operator.itemgetter() to get the corresponding index for each item:
>>> from collections import OrderedDict
>>> from operator import itemgetter
>>> itemgetter(*lst)({j:i for i,j in enumerate(OrderedDict.fromkeys(lst),1)})
(1, 1, 2, 3, 1, 2)
For completness, you can also do it eagerly:
from itertools import count
wordid = dict(zip(set(list_), count(1)))
This uses a set to obtain the unique words in list_, pairs
each of those unique words with the next value from count() (which
counts upwards), and constructs a dictionary from the results.
Original answer, written by nneonneo.

How to find second smallest UNIQUE number in a list?

I need to create a function that returns the second smallest unique number, which means if
list1 = [5,4,3,2,2,1], I need to return 3, because 2 is not unique.
I've tried:
def second(list1):
result = sorted(list1)[1]
return result
and
def second(list1):
result = list(set((list1)))
return result
but they all return 2.
EDIT1:
Thanks guys! I got it working using this final code:
def second(list1):
b = [i for i in list1 if list1.count(i) == 1]
b.sort()
result = sorted(b)[1]
return result
EDIT 2:
Okay guys... really confused. My Prof just told me that if list1 = [1,1,2,3,4], it should return 2 because 2 is still the second smallest number, and if list1 = [1,2,2,3,4], it should return 3.
Code in eidt1 wont work if list1 = [1,1,2,3,4].
I think I need to do something like:
if duplicate number in position list1[0], then remove all duplicates and return second number.
Else if duplicate number postion not in list1[0], then just use the code in EDIT1.
Without using anything fancy, why not just get a list of uniques, sort it, and get the second list item?
a = [5,4,3,2,2,1] #second smallest is 3
b = [i for i in a if a.count(i) == 1]
b.sort()
>>>b[1]
3
a = [5,4,4,3,3,2,2,1] #second smallest is 5
b = [i for i in a if a.count(i) == 1]
b.sort()
>>> b[1]
5
Obviously you should test that your list has at least two unique numbers in it. In other words, make sure b has a length of at least 2.
Remove non unique elements - use sort/itertools.groupby or collections.Counter
Use min - O(n) to determine the minimum instead of sort - O(nlongn). (In any case if you are using groupby the data is already sorted) I missed the fact that OP wanted the second minimum, so sorting is still a better option here
Sample Code
Using Counter
>>> sorted(k for k, v in Counter(list1).items() if v == 1)[1]
1
Using Itertools
>>> sorted(k for k, g in groupby(sorted(list1)) if len(list(g)) == 1)[1]
3
Here's a fancier approach that doesn't use count (which means it should have significantly better performance on large datasets).
from collections import defaultdict
def getUnique(data):
dd = defaultdict(lambda: 0)
for value in data:
dd[value] += 1
result = [key for key in dd.keys() if dd[key] == 1]
result.sort()
return result
a = [5,4,3,2,2,1]
b = getUnique(a)
print(b)
# [1, 3, 4, 5]
print(b[1])
# 3
Okay guys! I got the working code thanks to all your help and helping me to think on the right track. This code works:
`def second(list1):
if len(list1)!= len(set(list1)):
result = sorted(list1)[2]
return result
elif len(list1) == len(set(list1)):
result = sorted(list1)[1]
return result`
Okay, here usage of set() on a list is not going to help. It doesn't purge the duplicated elements. What I mean is :
l1=[5,4,3,2,2,1]
print set(l1)
Prints
[0, 1, 2, 3, 4, 5]
Here, you're not removing the duplicated elements, but the list gets unique
In your example you want to remove all duplicated elements.
Try something like this.
l1=[5,4,3,2,2,1]
newlist=[]
for i in l1:
if l1.count(i)==1:
newlist.append(i)
print newlist
This in this example prints
[5, 4, 3, 1]
then you can use heapq to get your second largest number in your list, like this
print heapq.nsmallest(2, newlist)[-1]
Imports : import heapq, The above snippet prints 3 for you.
This should to the trick. Cheers!

Top n closest numbers from a python list

I often need to select a certain amount of numbers from a list, so that they are the closest ones to some other certain number.
For example:
x0 = 45
n = 3
mylist = [12,32,432,43,54,234,23,543,2]
So, how do I select n numbers from the list which are the closest ones to x0? Is there some built-in method?
topN = [43, 54, 32]
The way I see is below, however it looks a bit convoluted:
diffs = sorted([(abs(x - x0), x) for x in mylist])
topN = [d[1] for d in diffs[:n]]
Use heapq.nsmallest:
heapq.nsmallest(n, iterable[, key])
Return a list with the n smallest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in the iterable: key=str.lower Equivalent to: sorted(iterable, key=key)[:n]
So in your particular case:
import heapq
x0 = 45
n = 3
mylist = [12,32,432,43,54,234,23,543,2]
heapq.nsmallest(n, mylist, key=lambda x: abs(x-x0))
This uses less overhead because it discards elements as they exceed n.
Can be alternatively done using sorting by custom function:
sorted(mylist, key = lambda x : abs(x-x0))[:n]
It is slower than heapq.nsmallest in terms of time complexity, however has less overhead and thus is more efficient for small lists.

find value of forloop at which event occurred Python

hey guys, this is very confusing...
i am trying to find the minimum of an array by:
for xpre in range(100): #used pre because I am using vapor pressures with some x molarity
xvalue=xarray[xpre]
for ppre in range(100): #same as xpre but vapor pressures for pure water, p
pvalue=parray[p]
d=math.fabs(xvalue-pvalue) #d represents the difference(due to vapor pressure lowering, a phenomenon in chemistry)
darray.append(d) #darray stores the differences
mini=min(darray) #mini is the minimum value in darray
darr=[] #this is to make way for a new set of floats
all the arrays (xarr,parr,darr)are already defined and what not. they have 100 floats each
so my question is how would I find the pvap and the xvap # which min(darr) is found?
edit
have changed some variable names and added variable descriptions, sorry guys
A couple things:
Try enumerate
Instead of darr being a list, use a dict and store the dvp values as keys, with the xindex and pindex variables as values
Here's the code
for xindex, xvalue in enumerate(xarr):
darr = {}
for pindex, pvalue in enumerate(parr):
dvp = math.fabs(xvalue - pvalue)
darr[dvp] = {'xindex': xindex, 'pindex': pindex}
mini = min(darr.keys())
minix = darr[mini]['xindex']
minip = darr[mini]['pindex']
minindex = darr.keys().index(mini)
print "minimum_index> {0}, is the difference of xarr[{1}] and parr[{2}]".format(minindex, minix, minip)
darr.clear()
Explanation
The enumerate function allows you to iterate over a list and also receive the index of the item. It is an alternative to your range(100). Notice that I don't have the line where I get the value at index xpre, ppre, this is because the enumerate function gives me both index and value as a tuple.
The most important change, however, is that instead of your darr being a list like this:
[130, 18, 42, 37 ...]
It is now a dictionary like this:
{
130: {'xindex': 1, 'pindex': 4},
18: {'xindex': 1, 'pindex': 6},
43: {'xindex': 1, 'pindex': 9},
...
}
So now, instead of just storing the dvp values alone, I am also storing the indices into x and p which generated those dvp values. Now, if I want to know something, say, Which x and p values produce the dvp value of 43? I would do this:
xindex = darr[43]['xindex']
pindex = darr[43]['pindex']
x = xarr[xindex]
p = parr[pindex]
Now x and p are the values in question.
Note I personally would store the values which produced a particular dvp, and not the indices of those values. But you asked for the indices so I gave you that answer. I'm going to assume that you have a reason for wanting to handle indices like this, but in Python generally you do not find yourself handling indices in this way when you are programming in Pythonic manner. This is a very C way of doing things.
Edit: This doesn't answer the OP's question:
min_diff, min_idx = min((math.fabs(a - b), i) for i, (a, b) in enumerate(zip(xpre, ppre)
right to left:
zip takes xpre and ppre and makes a tuple of the 1st, 2nd, ... elements respectively, like so:
[ (xpre[0],ppre[0]) , (xpre[1],ppre[1]) , ... ]
enumerate enumerates adds the index by just counting upwards from 0:
[ (0 , (xpre[0],ppre[0]) ) , (1 , (xpre[1],ppre[1]) ) , ... ]
This unpacks each nestet tuple:
for i, (a, b) in ...
i is the index generated by enumerate, a and b are the elements of xarr and parr.
This builds a tuple consisting of a difference and the index:
(math.fabs(a - b), i)
The whole thing inbetween the min(...) is a generator expression. min then finds the minimal value in these values, and the assignment unpacks them:
min_diff, min_idx = min(...)

Categories