Hello fellow stackoverflowers, I am practising my Python with an example question given to me (actually a Google interview practice question) and ran into a problem I did not know how to a) pose properly (hence vague title), b) overcome.
The question is: For an array of numbers (given or random) find unique pairs of numbers within the array which when summed give a given number. E.G: find the pairs of numbers in the array below which add to 6.
[1 2 4 5 11]
So in the above case:
[1,5] and [2,4]
The code I have written is:
from secrets import *
i = 10
x = randbelow(10)
number = randbelow(100) #Generate a random number to be the sum that we are after#
if number == 0:
pass
else:
number = number
array = []
while i>0: #Generate a random array to use#
array.append(x)
x = x + randbelow(10)
i -= 1
print("The following is a randomly generated array:\n" + str(array))
print("Within this array we are looking for a pair of numbers which sum to " + str(number))
for i in range(0,10):
for j in range(0,10):
if i == j or i>j:
pass
else:
elem_sum = array[i] + array[j]
if elem_sum == number:
number_one = array[i]
number_two = array[j]
print("A pair of numbers within the array which satisfy that condition is: " + str(number_one) + " and " + str(number_two))
else:
pass
If no pairs are found, I want the line "No pairs were found". I was thinking a try/except, but wasn't sure if it was correct or how to implement it. Also, I'm unsure on how to stop repeated pairs appearing (unique pairs only), so for example if I wanted 22 as a sum and had the array:
[7, 9, 9, 13, 13, 14, 23, 32, 41, 45]
[9,13] would appear twice
Finally forgive me if there are redundancies/the code isn't written very efficiently, I'm slowly learning so any other tips would be greatly appreciated!
Thanks for reading :)
You can simply add a Boolean holding the answer to "was at least one pair found?".
initialize it as found = false at the beginning of your code.
Then, whenever you find a pair (the condition block that holds your current print command), just add found = true.
after all of your search (the double for loop`), add this:
if not found:
print("No pairs were found")
Instead of actually comparing each pair of numbers, you can just iterate the list once, subtract the current number from the target number, and see if the remainder is in the list. If you convert the list to a set first, that lookup can be done in O(1), reducing the overall complexity from O(n²) to just O(n). Also, the whole thing can be done in a single line with a list comprehension:
>>> nums = [1, 2, 4, 5, 11]
>>> target = 6
>>> nums_set = set(nums)
>>> pairs = [(n, target-n) for n in nums_set if target-n in nums_set and n <= target/2]
>>> print(pairs)
[(1, 5), (2, 4)]
For printing the pairs or some message, you can use the or keyword. x or y is interpreted as x if x else y, so if the result set is empty, the message is printed, otherwise the result set itself.
>>> pairs = []
>>> print(pairs or "No pairs found")
No pairs found
Update: The above can fail, if the number added to itself equals the target, but is only contained once in the set. In this case, you can use a collections.Counter instead of a set and check the multiplicity of that number first.
>>> nums = [1, 2, 4, 5, 11, 3]
>>> nums_set = set(nums)
>>> [(n, target-n) for n in nums_set if target-n in nums_set and n <= target/2]
[(1, 5), (2, 4), (3, 3)]
>>> nums_counts = collections.Counter(nums)
>>> [(n, target-n) for n in nums_counts if target-n in nums_counts and n <= target/2 and n != target-n or nums_counts[n] > 1]
[(1, 5), (2, 4)]
List your constraints first!
numbers added must be unique
only 2 numbers can be added
the length of the array can be arbitrary
the number to be summed to can be arbitrary
& Don't skip preprocessing! Reduce your problem-space.
2 things off the bat:
Starting after your 2 print statements, the I would do array = list(set(array)) to reduce the problem-space to [7, 9, 13, 14, 23, 32, 41, 45].
Assuming that all the numbers in question will be positive, I would discard numbers above number. :
array = [x for x in array if x < number]
giving [7, 9, 9, 13, 13, 14]
Combine the last 2 steps into a list comprehension and then use that as array:
smaller_array = [x for x in list(set(array)) if x < number]
which gives array == [7, 9, 13, 14]
After these two steps, you can do a bunch of stuff. I'm fully aware that I haven't answered your question, but from here you got this. ^this is the kind of stuff I'd assume google wants to see.
Related
This question already has answers here:
Find pairs in two arrays such that when multiplied becomes a perfect square
(4 answers)
Closed 3 years ago.
You are given two lists and you have to find out the pairs which make a perfect square.
For example in:
a = [2, 6, 10, 13, 17, 18]
b = [3, 7, 8, 9, 11, 15]
There are two pairs (2,8) and (8,18).
Is there any method efficient than the brute-force?
Here is my code which has a time complexity of O(n*m)
(where n is the length of a and m is the length of b).
pl = []
a = [ 2, 6, 10, 13, 17,18]
b = [ 3, 7, 8, 9, 11, 15 ]
i = 0
while(i < len(a)):
j = 0
while(j < len(b)):
p = a[i]*b[j]
n = p**0.5
u = int(n)
if n == u:
pl.append((a[i],b[j]))
j = j+1
i = i+1
print(pl)
This question has been asked before using C# here, but I don't get what they mean by "All we would need to store for each number is which of its prime factors have an odd count", so I can't implement this in my Python code.
Can someone explain to me how we might implement an efficient solution in Python?
The logic described in the linked question is as follows. A perfect square's prime factors will always be in pairs. For example, 36 = 2*2*3*3. We have two 3s and two 2s. Therefore, if we take any two numbers that multiply to form a perfect square, if we take the combined counts of each of their prime factors, we will also get even counts.
For example, 8 = 2*2*2, and 18 = 2*3*3. Combined, we get four 2s and two 3s.
Below is some Python code that implements this algorithm, using collections.Counter and sets to remove duplicates. First, we precompute all prime factorizations for each unique element in a and b to avoid redundancy, and store this in a dictionary. Then, we loop over pairs of elements from a and b, using itertools.product, and combine the prime factor counts. If every count is even, we add the sorted pair to a set.
Code:
from collections import Counter
import itertools
def prime_factors(n):
"""
https://stackoverflow.com/a/22808285/12366110
"""
i = 2
factors = []
while i * i <= n:
if n % i:
i += 1
else:
n //= i
factors.append(i)
if n > 1:
factors.append(n)
return Counter(factors)
a = [2, 6, 10, 13, 17,18]
b = [3, 7, 8, 9, 11, 15]
prime_factors = {i: prime_factors(i) for i in set(a) | set(b)}
rv = set()
for a, b in itertools.product(a, b):
combined_counts = prime_factors[a] + prime_factors[b]
if all(v%2 == 0 for v in combined_counts.values()):
rv.add(tuple(sorted([a, b])))
Output:
>>> rv
{(2, 8), (8, 18)}
I am trying to find elements from array(integer array) or list which are unique and those elements must not divisible by any other element from same array or list.
You can answer in any language like python, java, c, c++ etc.
I have tried this code in Python3 and it works perfectly but I am looking for better and optimum solution in terms of time complexity.
assuming array or list A is already sorted and having unique elements
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
while i<len(A)-1:
while j<len(A):
if A[j]%A[i]==0:
A.pop(j)
else:
j+=1
i+=1
j=i+1
For the given array A=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16] answer would be like ans=[2,3,5,7,11,13]
another example,A=[4,5,15,16,17,23,39] then ans would be like, ans=[4,5,17,23,39]
ans is having unique numbers
any element i from array only exists if (i%j)!=0, where i!=j
I think it's more natural to do it in reverse, by building a new list containing the answer instead of removing elements from the original list. If I'm thinking correctly, both approaches do the same number of mod operations, but you avoid the issue of removing an element from a list.
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
ans = []
for x in A:
for y in ans:
if x % y == 0:
break
else: ans.append(x)
Edit: Promoting the completion else.
This algorithm will perform much faster:
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
if (A[-1]-A[0])/A[0] > len(A)*2:
result = list()
for v in A:
for f in result:
d,m = divmod(v,f)
if m == 0: v=0;break
if d<f: break
if v: result.append(v)
else:
retain = set(A)
minMult = 1
maxVal = A[-1]
for v in A:
if v not in retain : continue
minMult = v*2
if minMult > maxVal: break
if v*len(A)<maxVal:
retain.difference_update([m for m in retain if m >= minMult and m%v==0])
else:
retain.difference_update(range(minMult,maxVal,v))
if maxVal%v == 0:
maxVal = max(retain)
result = list(retain)
print(result) # [2, 3, 5, 7, 11, 13]
In the spirit of the sieve of Eratostenes, each number that is retained, removes its multiples from the remaining eligible numbers. Depending on the magnitude of the highest value, it is sometimes more efficient to exclude multiples than check for divisibility. The divisibility check takes several times longer for an equivalent number of factors to check.
At some point, when the data is widely spread out, assembling the result instead of removing multiples becomes faster (this last addition was inspired by Imperishable Night's post).
TEST RESULTS
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16] (100000 repetitions)
Original: 0.55 sec
New: 0.29 sec
A = list(range(2,5000))+[9697] (100 repetitions)
Original: 3.77 sec
New: 0.12 sec
A = list(range(1001,2000))+list(range(4000,6000))+[9697**2] (10 repetitions)
Original: 3.54 sec
New: 0.02 sec
I know that this is totally insane but i want to know what you think about this:
A = [4,5,15,16,17,23,39]
prova=[[x for x in A if x!=y and y%x==0] for y in A]
print([A[idx] for idx,x in enumerate(prova) if len(prova[idx])==0])
And i think it's still O(n^2)
If you care about speed more than algorithmic efficiency, numpy would be the package to use here in python:
import numpy as np
# Note: doesn't have to be sorted
a = [2, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 16, 29, 29]
a = np.unique(a)
result = a[np.all((a % a[:, None] + np.diag(a)), axis=0)]
# array([2, 3, 5, 7, 11, 13, 29])
This divides all elements by all other elements and stores the remainder in a matrix, checks which columns contain only non-0 values (other than the diagonal), and selects all elements corresponding to those columns.
This is O(n*M) where M is the max size of an integer in your list. The integers are all assumed to be none negative. This also assumes your input list is sorted (came to that assumption since all lists you provided are sorted).
a = [4, 7, 7, 8]
# a = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
# a = [4, 5, 15, 16, 17, 23, 39]
M = max(a)
used = set()
final_list = []
for e in a:
if e in used:
continue
else:
used.add(e)
for i in range(e, M + 1):
if not (i % e):
used.add(i)
final_list.append(e)
print(final_list)
Maybe this can be optimized even further...
If the list is not sorted then for the above method to work, one must sort it. The time complexity will then be O(nlogn + Mn) which equals to O(nlogn) when n >> M.
My self-learning task is to find how many sequences are on the list. A sequence is a group of numbers, where each is one 1 bigger than the previous one. So, in the list:
[1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
there are 3 sequences:
1,2,3
12,13,14,15
23,24,25,26
I've spent few hours and got a solution, which I think is a workaround rather than the real solution.
My solution is to have a separate list for adding sequences and count the attempts to update this list. I count the very first appending, and every new appending except for the sequence, which already exists.
I believe there is a solution without additional list, which allows to count the sequences itself rather than the list manipulation attempts.
numbers = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
goods = []
count = 0
for i in range(len(numbers)-1):
if numbers[i] + 1 == numbers[i+1]:
if goods == []:
goods.append(numbers[i])
count = count + 1
elif numbers[i] != goods[-1]:
goods.append(numbers[i])
count = count + 1
if numbers[i+1] != goods[-1]:
goods.append(numbers[i+1])
The output from my debugging:
Number 1 added to: [1]
First count change: 1
Number 12 added to: [1, 2, 3, 12]
Normal count change: 2
Number 23 added to: [1, 2, 3, 12, 13, 14, 15, 23]
Normal count change: 3
Thanks everyone for your help!
Legman suggested the original solution I failed to implemented before I end up with another solution in this post.
MSeifert helped to find a the right way with the lists:
numbers = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
print("Numbers:", numbers)
goods = []
count = 0
for i in range(len(numbers)-1):
if numbers[i] + 1 == numbers[i+1]:
if goods == []:
goods.append([numbers[i]])
count = count + 1
elif numbers[i] != goods[-1][-1]:
goods.append([numbers[i]])
count = count + 1
if numbers[i+1] != goods[-1]:
goods[-1].extend([numbers[i+1]])
print("Sequences:", goods)
print("Number of sequences:", len(goods))
One way would be to iterate over pairwise elements:
l = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
res = [[]]
for item1, item2 in zip(l, l[1:]): # pairwise iteration
if item2 - item1 == 1:
# The difference is 1, if we're at the beginning of a sequence add both
# to the result, otherwise just the second one (the first one is already
# included because of the previous iteration).
if not res[-1]: # index -1 means "last element".
res[-1].extend((item1, item2))
else:
res[-1].append(item2)
elif res[-1]:
# The difference isn't 1 so add a new empty list in case it just ended a sequence.
res.append([])
# In case "l" doesn't end with a "sequence" one needs to remove the trailing empty list.
if not res[-1]:
del res[-1]
>>> res
[[1, 2, 3], [12, 13, 14, 15], [23, 24, 25, 26]]
>>> len(res) # the amount of these sequences
3
A solution without zip only requires small changes (the loop and the the beginning of the loop) compared to the approach above:
l = [1,2,3,5,8,10,12,13,14,15,17,19,21,23,24,25,26]
res = [[]]
for idx in range(1, len(l)):
item1 = l[idx-1]
item2 = l[idx]
if item2 - item1 == 1:
if not res[-1]:
res[-1].extend((item1, item2))
else:
res[-1].append(item2)
elif res[-1]:
res.append([])
if not res[-1]:
del res[-1]
Taken from python itertools documentation, as demonstrated here you can use itemgetter and groupby to do that using only one list, like so:
>>> from itertools import groupby
>>> from operator import itemgetter
>>>
>>> l = [1, 2, 3, 5, 8, 10, 12, 13, 14, 15, 17, 19, 21, 23, 24, 25, 26]
>>>
>>> counter = 0
>>> for k, g in groupby(enumerate(l), lambda (i,x):i-x):
... seq = map(itemgetter(1), g)
... if len(seq)>1:
... print seq
... counter+=1
...
[1, 2, 3]
[12, 13, 14, 15]
[23, 24, 25, 26]
>>> counter
3
Notice: As correctly mentioned by #MSeifert, tuple unpacking in the signature is only possible in Python 2 and it will fail on Python 3 - so this is a python 2.x solution.
This could be solved with dynamic programming. If you only want to know the number of sequences and don't actually need to know what the sequences are you should be able to do this with only a couple of variables. Realistically, as you're going through the list you only really need to know if you are currently in a sequence, if not if the next one is incremented by 1 making this the beginning of a sequence and if so is the next one greater than 1 making it the exit of a sequence. After that, you just need to make sure to end the loop one cell before the end of the list since the last cell cant form a sequence by itself and so that it doesn't cause an error when you're performing a check. Below is example code
isSeq=false
for i in range(len(numbers)-1):
if isSeq==false:
if numbers[i]+1==numbers[i+1]:
isSeq=true
count=count+1
elif
if numbers[i]+1!=numbers[i+1]:
isSeq=false
Here is a link to a dynamic programming tutorial.
https://www.codechef.com/wiki/tutorial-dynamic-programming
I have a list of numbers for input, e.g.
671.00
1,636.00
436.00
9,224.00
and I want to generate all possible sums with a way to id it for output, e.g.:
671.00 + 1,636.00 = 2,307.00
671.00 + 436.00 = 1,107.00
671.00 + 9,224.00 = 9,224.00
671.00 + 1,636.00 + 436.00 = 2,743.00
...
and I would like to do it in Python
My current constrains are:
a) I'm just learning python now (that's part of the idea)
b) I will have to use Python 2.5.2 (no intertools)
I think I have found a piece of code that may help:
def all_perms(str):
if len(str) <=1:
yield str
else:
for perm in all_perms(str[1:]):
for i in range(len(perm)+1):
#nb str[0:1] works in both string and list contexts
yield perm[:i] + str[0:1] + perm[i:]
( from these guys )
But I'm not sure how to use it in my propose.
Could someone trow some tips and pieces of code of help?
cheers,
f.
Permutations are about taking an ordered set of things and moving these things around (i.e. changing order). Your question is about combinations of things from your list.
Now, an easy way of enumerating combinations is by mapping entries from your list to bits in a number. For example, lets assume that if bit #0 is set (i.e. 1), then number lst[0] participates in the combination, if bit #1 is set, then lst[1] participates in the combination, etc. This way, numbers in range 0 <= n < 2**(len(lst)) identify all possible combinations of lst members, including an empty one (n = 0) and the whole lst (n = 2**(len(lst)) - 1).
You need only combinations of 2 items or more, i.e. only those combination IDs that have at least two nonzero bits in their binary representation. Here is how to identify these:
def HasAtLeastTwoBitsSet(x) :
return (x & (x-1)) != 0
# Testing:
>>> [x for x in range(33) if HasAtLeastTwoBitsSet(x)]
[3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
Next step is to extract a combination of list members identified by a combination id. This is easy, thanks to the power of list comprehensions:
def GetSublistByCombination(lst, combination_id) :
res = [x for (i,x) in enumerate(lst) if combination_id & (1 << i)]
return res
# Testing:
>>> GetSublistByCombination([0,1,2,3], 1)
[0]
>>> GetSublistByCombination([0,1,2,3], 3)
[0, 1]
>>> GetSublistByCombination([0,1,2,3], 12)
[2, 3]
>>> GetSublistByCombination([0,1,2,3], 15)
[0, 1, 2, 3]
Now let's make a generator that produces all sums, together with their string representations:
def IterAllSums(lst) :
combinations = [i for i in range(1 << len(lst)) if HasAtLeastTwoBitsSet(i)]
for comb in combinations :
sublist = GetSublistByCombination(lst, comb)
sum_str = '+'.join(map(str, sublist))
sum_val = sum(sublist)
yield (sum_str, sum_val)
And, finally, let's use it:
>>> for sum_str, sum_val in IterAllSums([1,2,3,4]) : print sum_str, sum_val
1+2 3
1+3 4
2+3 5
1+2+3 6
1+4 5
2+4 6
1+2+4 7
3+4 7
1+3+4 8
2+3+4 9
1+2+3+4 10
The code below generates all "subsets" of a given list (except the empty set), i.e. it returns a list of lists.
def all_sums(l): #assumes that l is non-empty
if len(l)==1:
return ([[l[0]]])
if len(l)==0:
return []
result = []
for i in range(0,len(l)):
result.append([l[i]])
for p in all_sums(l[i+1:]):
result.append([l[i]]+p)
return result
Now you could just write a short function doit for output also:
def doit(l):
mylist = all_sums(l)
print mylist
for i in mylist:
print str(i) + " = " + str(sum(i))
doit([1,2,3,4])
With itertools (Python >=2.6) would be:
from itertools import *
a=[1,2,3,4]
sumVal=[tuple(imap(sum,combinations(a,i))) for i in range(2,len(a)+1)]
I have a sumranges() function, which sums all the ranges of consecutive numbers found in a tuple of tuples. To illustrate:
def sumranges(nums):
return sum([sum([1 for j in range(len(nums[i])) if
nums[i][j] == 0 or
nums[i][j - 1] + 1 != nums[i][j]]) for
i in range(len(nums))])
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> print sumranges(nums)
7
As you can see, it returns the number of ranges of consecutive digits within the tuple, that is: len((1, 2, 3, 4), (1), (5, 6), (19, 20), (24), (29), (400)) = 7. The tuples are always ordered.
My problem is that my sumranges() is terrible. I hate looking at it. I'm currently just iterating through the tuple and each subtuple, assigning a 1 if the number is not (1 + previous number), and summing the total. I feel like I am missing a much easier way to accomplish my stated objective. Does anyone know a more pythonic way to do this?
Edit: I have benchmarked all the answers given thus far. Thanks to all of you for your answers.
The benchmarking code is as follows, using a sample size of 100K:
from time import time
from random import randrange
nums = [sorted(list(set(randrange(1, 10) for i in range(10)))) for
j in range(100000)]
for func in sumranges, alex, matt, redglyph, ephemient, ferdinand:
start = time()
result = func(nums)
end = time()
print ', '.join([func.__name__, str(result), str(end - start) + ' s'])
Results are as follows. Actual answer shown to verify that all functions return the correct answer:
sumranges, 250281, 0.54171204567 s
alex, 250281, 0.531121015549 s
matt, 250281, 0.843333005905 s
redglyph, 250281, 0.366822004318 s
ephemient, 250281, 0.805964946747 s
ferdinand, 250281, 0.405596971512 s
RedGlyph does edge out in terms of speed, but the simplest answer is probably Ferdinand's, and probably wins for most pythonic.
My 2 cents:
>>> sum(len(set(x - i for i, x in enumerate(t))) for t in nums)
7
It's basically the same idea as descriped in Alex' post, but using a set instead of itertools.groupby, resulting in a shorter expression. Since sets are implemented in C and len() of a set runs in constant time, this should also be pretty fast.
Consider:
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> flat = [[(x - i) for i, x in enumerate(tu)] for tu in nums]
>>> print flat
[[1, 1, 1, 1], [1, 4, 4], [19, 19, 22, 26, 396]]
>>> import itertools
>>> print sum(1 for tu in flat for _ in itertools.groupby(tu))
7
>>>
we "flatten" the "increasing ramps" of interest by subtracting the index from the value, turning them into consecutive "runs" of identical values; then we identify and could the "runs" with the precious itertools.groupby. This seems to be a pretty elegant (and speedy) solution to your problem.
Just to show something closer to your original code:
def sumranges(nums):
return sum( (1 for i in nums
for j, v in enumerate(i)
if j == 0 or v != i[j-1] + 1) )
The idea here was to:
avoid building intermediate lists but use a generator instead, it will save some resources
avoid using indices when you already have selected a subelement (i and v above).
The remaining sum() is still necessary with my example though.
Here's my attempt:
def ranges(ls):
for l in ls:
consec = False
for (a,b) in zip(l, l[1:]+(None,)):
if b == a+1:
consec = True
if b is not None and b != a+1:
consec = False
if consec:
yield 1
'''
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> print sum(ranges(nums))
7
'''
It looks at the numbers pairwise, checking if they are a consecutive pair (unless it's at the last element of the list). Each time there's a consecutive pair of numbers it yields 1.
This could probably be put together in a more compact form, but I think clarity would suffer:
def pairs(seq):
for i in range(1,len(seq)):
yield (seq[i-1], seq[i])
def isadjacent(pair):
return pair[0]+1 == pair[1]
def sumrange(seq):
return 1 + sum([1 for pair in pairs(seq) if not isadjacent(pair)])
def sumranges(nums):
return sum([sumrange(seq) for seq in nums])
nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
print sumranges(nums) # prints 7
You could probably do this better if you had an IntervalSet class because then you would scan through your ranges to build your IntervalSet, then just use the count of set members.
Some tasks don't always lend themselves to neat code, particularly if you need to write the code for performance.
There is a formula for this, the sum of the first n numbers, 1+ 2+ ... + n = n(n+1) / 2 . Then if you want to have the sum of i-j then it is (j(j+1)/2) - (i(i+1)/2) this I am sure simplifies but you can work that out. It might not be pythonic but it is what I would use.