Iterate over lists with a particular sum - python

I would like to iterate over all lists of length n whose elements are each -1 or 1 and sum to 2. How can this be done efficiently? Here is a very inefficient method for n = 10. Ultimately I would like to do this for n > 25.
    import itertools

    n = 10
    for L in itertools.product([-1, 1], repeat=n):
        if sum(L) == 2:
            print(L)  # Do something with L

You can only get a sum of 2 if the list has two more +1s than -1s, so for n == 24:
    a_solution = [-1] * 11 + [1] * 13
You could now use itertools.permutations to get every permutation of this list, but it treats equal elements as distinct and therefore repeats each list many times:
    for L in itertools.permutations(a_solution):
        print(L)
It is far faster to use itertools.combinations to choose the positions of the -1s, which eliminates the duplicates:
    import itertools
    import numpy

    for indices in itertools.combinations(range(24), 11):
        a = numpy.ones(24)
        a[list(indices)] = -1
        print(a)
Note that to get a sum of 2 the list must have even length.
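The position-choosing idea above can also be written without numpy. The following is a minimal sketch, not the answerer's code: the function name lists_with_sum and its target parameter are made up for illustration, and math.comb assumes Python 3.8 or later.

```python
from itertools import combinations
from math import comb

def lists_with_sum(n, target=2):
    """Yield every length-n list over {-1, 1} whose elements sum to target."""
    # k entries equal to -1 give a sum of (n - k) - k = n - 2k.
    if abs(target) > n or (n - target) % 2:
        return  # unreachable target: out of range or wrong parity
    k = (n - target) // 2
    for positions in combinations(range(n), k):
        row = [1] * n
        for p in positions:
            row[p] = -1
        yield row

results = list(lists_with_sum(4))
print(results)
print(len(results) == comb(4, 1))
```

Only comb(n, k) lists are ever produced, compared with the 2**n candidates that itertools.product has to filter.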

One way is to stop your product recursion whenever the remaining elements can't make up the target sum.
Specifically, this method would take your itertools.product(...,repeat) apart into a recursive generator, which updates the target sum based on the value of the current list element, and checks to see if the resulting target is achievable before recursing further:
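    def generate_lists(target, n):
        # Prune first: n elements of +/-1 cannot reach a target outside [-n, n].
        if target > n or target < -n:
            return
        # With no elements left, the check above guarantees target == 0.
        if n <= 0:
            yield []
            return
        for element in [-1, 1]:
            for rest in generate_lists(target - element, n - 1):
                yield rest + [element]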
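One way to gain confidence in a pruned recursion like this is to compare it with the brute-force filter on a small n. The sketch below is an illustration under assumptions: pruned_lists and brute_force are hypothetical names, and the parity test is an extra pruning rule that is not part of the answer above.

```python
from itertools import product

def pruned_lists(target, n):
    """Yield length-n lists over {-1, 1} summing to target, pruning dead branches."""
    # A branch is dead when the target is out of reach or has the wrong parity.
    if abs(target) > n or (n - target) % 2:
        return
    if n == 0:
        yield []  # the checks above guarantee target == 0 here
        return
    for element in (-1, 1):
        for rest in pruned_lists(target - element, n - 1):
            yield rest + [element]

def brute_force(target, n):
    """Reference answer: filter every one of the 2**n candidates."""
    return [list(L) for L in product([-1, 1], repeat=n) if sum(L) == target]

n = 8
print(sorted(pruned_lists(2, n)) == sorted(brute_force(2, n)))
```

Parity is preserved by every recursive step, so the parity test only ever rejects a call at the top level; the range test does the pruning deeper in the recursion.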

Related

most efficient way to iterate over a large array looking for a missing element in Python

I was trying an online test. It asked for a function that, given a list of up to 100000 integers in the range 1 to 100000, finds the first missing integer.
For example, if the list is [1,4,5,2] the output should be 3.
I iterated over the list as follows:
    def find_missing(num):
        for i in range(1, 100001):
            if i not in num:
                return i
The feedback I received was that the code is not efficient at handling big lists.
I am quite new and could not find an answer. How can I iterate more efficiently?
The first improvement would be to make your approach linear by using a set for the repeated membership test:
    def find_missing(nums):
        s = set(nums)
        for i in range(1, 100001):
            if i not in s:
                return i
Given how well optimized Python's C-implemented sort is, you could also do something like:
    def find_missing(nums):
        s = sorted(set(nums))
        return next(i for i, n in enumerate(s, 1) if i != n)
But both of these are fairly space-inefficient as they create a new collection. You can avoid that with an in-place sort:
    from itertools import groupby

    def find_missing(nums):
        nums.sort()  # in-place
        # groupby collapses duplicates, so k runs over the distinct values
        return next(i for i, (k, _) in enumerate(groupby(nums), 1) if i != k)
For any contiguous range of integers, the sum is given by Gauss's formula. Assuming nums is sorted and exactly one value strictly between nums[0] and nums[-1] is missing:
    # sum of all numbers up to and including nums[-1] minus
    # sum of all numbers up to but not including nums[0]
    expected = nums[-1] * (nums[-1] + 1) // 2 - nums[0] * (nums[0] - 1) // 2
If a number is missing, the actual sum will be
    actual = sum(nums)
The difference is the missing number:
    result = expected - actual
This computation is O(n), which is as efficient as you can get: expected is an O(1) computation, while actual has to add up all the elements.
A somewhat slower approach of the same complexity is to step along the sorted sequence in lockstep with either a range or itertools.count:
    for a, e in zip(nums, range(nums[0], len(nums) + nums[0])):
        if a != e:
            return e  # or break if not in a function
Notice the difference between a single comparison a != e and a linear containment check like e in nums, which has to iterate through half of nums on average to get the answer.
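Put together as functions, the two ideas might look like the sketch below. The names missing_by_sum and missing_by_lockstep are hypothetical, and both assume the input is a run of consecutive integers with exactly one interior value removed and no duplicates.

```python
def missing_by_sum(nums):
    """Return the single missing value using Gauss's formula."""
    lo, hi = min(nums), max(nums)  # the sum does not need sorted input
    expected = hi * (hi + 1) // 2 - lo * (lo - 1) // 2
    return expected - sum(nums)

def missing_by_lockstep(nums):
    """Walk the sorted values next to the values we expect to see."""
    ordered = sorted(nums)
    expected_values = range(ordered[0], ordered[0] + len(ordered))
    for actual, expected in zip(ordered, expected_values):
        if actual != expected:
            return expected
    return None  # nothing is missing inside the run

print(missing_by_sum([1, 4, 5, 2]))       # 3
print(missing_by_lockstep([1, 4, 5, 2]))  # 3
```

If the input can contain duplicates or more than one gap, the sum no longer identifies the missing value, and the set-based membership test is the safer choice.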
You can use Counter to count every occurrence in your list. The smallest number with occurrence 0 will be your output. For example:
    from collections import Counter

    def find_missing(your_list):
        count = Counter(your_list)
        keys = count.keys()  # every distinct element (not sorted)
        main_list = range(1, 100001)  # the values from 1 to 100k
        missing_numbers = set(main_list) - set(keys)
        your_output = min(missing_numbers)
        return your_output

Find number of subset that satisfy these two conditions?

    def findNumber(N, A, B):
        return count
count is the total number of subsets of the array [1,2,3,...,N] satisfying these conditions:
1. All subsets should be contiguous.
2. No subset should contain both A[i] and B[i] (order doesn't matter).
Example
N = 3, A=[2,1,3], B=[3,3,1]
All subsets = [1],[2],[3],[1,2],[2,3],[1,2,3]
Invalid subsets = [2,3] because A[0] and B[0] are in it, and [1,2,3] because it contains A[1],B[1] and A[2],B[2].
So count will be 4.
I was able to figure out that the total number of contiguous subsets is N(N+1)/2, but I got stuck on how to satisfy condition 2.
I tried to explain it as best I could; please ask for clarification if needed.
EDIT
    def findallvalid(n,a,b):
        for w in range(1, n+1):
            for i in range(n-w+1):
                if not((a[0],b[0]) in (i+1,i+w+1)):
                    yield range(i+1,i+w+1)
I tried this code, but I don't know how to iterate over all values of a and b without making it very slow. It's already too slow for n > 10^2.
1<=n<=10^5
1<=len(A)<=10^6
I'm interested in how to approach this problem without generating subsets. For example, I found that the total number of contiguous subsets is n(n+1)/2; I just want to know how to count the subsets to rule out.
That gave me an idea: it is indeed quite simple to compute the number of subsets ruled out by a single pair (A[i], B[i]). Doing it for multiple pairs is a little more challenging, since the excluded subsets can overlap, so just subtracting a number for each pair won't work. What works is to keep a set of the indexes of all N(N+1)/2 subsets and remove the indexes of the excluded subsets from it; at the end, the cardinality of the reduced index set is the wanted number of remaining subsets.
    def findNumber(N, A, B):
        count = N*(N+1)//2
        powerset = set(range(count))  # set of enumeration of possible intervals
        for a, b in zip(A, B):
            if a > b: a, b = b, a  # let a be the lower number
            # when a and b are in a subset, they form a sub-subset of length "span"
            span = (b-a)+1
            start = 0  # index where the intervals of current length w begin
            for w in range(1, N+1):  # for all interval lengths w
                if span <= w:  # if a and b can be in interval of length w
                    first = 0 if b <= w else b-w  # index of first containment
                    # one past the last containment, capped at the N+1-w intervals
                    # of this length so the next length's indexes stay untouched
                    last = min(a, N+1-w)
                    # remove the intervals containing a and b from the enumeration
                    powerset -= set(range(start+first, start+last))
                start += N+1-w  # compute start index of the next length
        return len(powerset)  # number of remaining intervals
I made a small modification to your code. It is still really slow, because it iterates over every contiguous subset of a list that can hold 10^5 items and does nested work for each one, which pushes the cost to the order of 10^10 operations:
    from collections import defaultdict

    def findallvalid(N, A, B):
        # map each value to the indexes of the pairs it appears in
        a_in = defaultdict(list)
        b_in = defaultdict(list)
        for idx, a in enumerate(A):
            a_in[a].append(idx)
        for idx, b in enumerate(B):
            b_in[b].append(idx)

        def diff_elem_index(subset):
            # valid only if no pair index is hit twice (once via A, once via B)
            indices = []
            for elem in subset:
                indices.extend(a_in[elem])
                indices.extend(b_in[elem])
            return len(set(indices)) == len(indices)

        for set_window in range(1, N+1):
            for start_idx in range(N - set_window + 1):
                sett = list(range(start_idx+1, start_idx + set_window + 1))
                if diff_elem_index(sett):
                    yield sett
My best guess is that, since the code only needs to return the count of items, the problem can be solved mathematically.
The number of contiguous subsets of an N-element list is N*(N+1)/2 (plus 1 if the empty subset is counted).
From that you deduct the number of subsets that violate the second condition, which can be worked out from lists A and B.
I think computing the count of excluded subsets from A and B will be much more efficient than going through all contiguous subsets of 1 to N.
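One possible way to do that counting without enumerating subsets is sketched here. It is not taken from any answer in the thread: the function name count_valid_intervals and the "blocked" table are assumptions made for illustration. The idea is that a contiguous subset ending at r is valid exactly when it starts after the largest lower endpoint of any pair that lies entirely at or below r.

```python
def count_valid_intervals(n, a, b):
    """Count contiguous runs [l..r] of 1..n that contain no pair (a[i], b[i])."""
    # blocked[r] = largest lower endpoint among pairs whose upper endpoint is r.
    blocked = [0] * (n + 1)
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        blocked[hi] = max(blocked[hi], lo)

    total = 0
    limit = 0  # runs ending at r must start strictly after this value
    for r in range(1, n + 1):
        limit = max(limit, blocked[r])
        total += r - limit  # valid start points are limit + 1 .. r
    return total

print(count_valid_intervals(3, [2, 1, 3], [3, 3, 1]))  # 4
```

This runs in O(N + len(A)) time and O(N) memory, which fits the stated limits of N up to 10^5 and up to 10^6 pairs.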

Finding All permutations of a list when given a function that returns the next permutation of a list

In my assignment this week I was asked to write a python script that takes a number n and returns all permutations of [0,1,2,...,n-1]. So far I have written a script that takes a list and returns the next permutation of the list. I am looking for ideas on how I can write the script based on what I've written so far.
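    def next_permutation(p):
        a = len(p)
        # find the rightmost i with p[i] < p[i+1]
        i = a - 2
        while i >= 0 and p[i] >= p[i+1]:
            i = i - 1
        if i == -1:
            return []  # p is already the last permutation
        # the tail is descending, so scan right to the last j with p[j] >= p[i]
        j = i + 1
        while j < a and p[j] >= p[i]:
            j += 1
        j -= 1
        p[i], p[j] = p[j], p[i]
        # reverse the tail after position i
        k = i + 1
        l = a - 1
        while k < l:
            p[k], p[l] = p[l], p[k]
            k += 1
            l -= 1
        return p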
EDIT: this is the code that returns the next permutation of a list. I wrote this entirely based on the instruction provided by my instructor.
Since you want all the permutations of a list with the numbers from 0 to n-1, the steps you need to take are already clear:
1. Create a list that contains all numbers from 0 to n-1.
This can easily be done with the built-in range() function, since it is mostly used for exactly that purpose:
    range(stop)
    This is a versatile function to create iterables yielding arithmetic progressions.
2. Calculate the number of permutations that such a list has.
Math tells us that N elements have N! different permutations, where ! means factorial. We can import the factorial function from the math module, which quickly gives the number of permutations of your list:
    from math import factorial
    print(factorial(4)) # 24
3. Call the next_permutation(p) function that you already wrote that many times and yield each permutation.
To return something more than once from a function, you can use yield instead.
With these steps in mind, you can create something similar to this:
    def all_permutations(n):
        # Constructing a list that contains all numbers from 0 to n-1
        integer_list = list(range(n))
        # Calculating the number of permutations such a list has
        permutation_count = factorial(n)
        # Output that many permutations
        for _ in range(permutation_count):
            # Yield a copy, because next_permutation modifies the list in place
            yield list(integer_list)
            integer_list = next_permutation(integer_list)
This generator function yields all permutations of a list containing the numbers from 0 to n-1, which is exactly what you need.
To create a list that contains all of the permutations you can write something simple like:
    n = 4
    permutations = list(all_permutations(n))
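An alternative that avoids computing the factorial is to keep calling the step function until it reports that no further permutation exists. The sketch below is an assumption-laden variant, not the asker's code: its next_permutation returns True or False instead of the list or [], so it is a hypothetical rewrite rather than a drop-in replacement.

```python
from itertools import permutations

def next_permutation(p):
    """Advance p in place to the next lexicographic permutation.

    Returns False (leaving p unchanged) when p is already the last one.
    """
    i = len(p) - 2
    while i >= 0 and p[i] >= p[i + 1]:
        i -= 1
    if i < 0:
        return False
    j = len(p) - 1
    while p[j] <= p[i]:
        j -= 1
    p[i], p[j] = p[j], p[i]
    p[i + 1:] = reversed(p[i + 1:])
    return True

def all_permutations(n):
    """Yield every permutation of 0..n-1 in lexicographic order."""
    current = list(range(n))
    while True:
        yield tuple(current)  # snapshot, since current is mutated in place
        if not next_permutation(current):
            return

print(list(all_permutations(3)) == list(permutations(range(3))))
```

Yielding a tuple snapshot matters: yielding current itself would hand every caller the same list object, which keeps changing as the generator advances.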

Find the Duplicate Number

Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one.
My solution:
    def findDuplicate(nums):
        slow = fast = finder = 0
        while fast is not None:
            slow = nums[slow]
            fast = nums[nums[fast]]
            if fast is slow:
                return slow
        return False

    nums = [1,2,2,3,4]
    print(findDuplicate(nums))
My solution above works here and gives the output 2, but it doesn't work for every input. For example, it fails for [11,15,17,17,14] or [3,1,2,6,2,3] with IndexError: list index out of range. I can't find the pattern or track down the exact problem. I also tried changing my while condition:
    while fast is not None and nums[nums[fast]] is not None:
Your help will be greatly appreciated! Thank you.
Since the numbers are between 1 and n and you have been told there is only one duplicate, you can use the difference between the sum of the numbers in the array and the sum of the numbers from 1 to n to get the duplicate. This assumes the duplicate appears exactly twice, so that every other value from 1 to n appears once.
    def findDuplicate(l):
        n = len(l) - 1  # Get n as length of list - 1
        return sum(l) - (n * (n + 1) // 2)  # n*(n+1)//2 is the sum of integers from 1 to n
So the duplicate is the sum of the list minus n*(n+1)/2.
Of course, this doesn't generalize to finding duplicates in any list. For that case, you need to use @Jalepeno112's answer.
The fact that the first one works is a fluke. Let's look at what it does on the first pass.
nums = [1,2,2,3,4]
# slow starts as index 0. So now, you've reassigned slow to be nums[0] which is 1.
# so slow equals 1
slow = nums[slow]
# now you are saying that fast equals nums[nums[0]].
# nums[0] is 1. nums[1] is 2
# so fast = 2
fast = nums[nums[fast]]
On the next pass, slow will be nums[1] which is 2. fast will be nums[nums[2]] which is nums[2] which is 2. At this point slow and fast are equal.
In your second example, you get an IndexError because of fast = nums[nums[fast]]. If the value at nums[fast] is not a valid index, this code fails. Specifically, in the second example nums[0] is 11, and nums doesn't have an element at index 11, so you get an error.
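The index-chasing approach only makes sense when every value is itself a valid index, which is what the problem's "n + 1 integers between 1 and n" constraint guarantees. A small sketch of that check, with the hypothetical helper name values_are_valid_indices, shows why the first input works and the other two fail:

```python
def values_are_valid_indices(nums):
    """True when nums has n + 1 entries and each one lies in 1..n."""
    n = len(nums) - 1
    return n >= 1 and all(1 <= value <= n for value in nums)

for sample in ([1, 2, 2, 3, 4], [11, 15, 17, 17, 14], [3, 1, 2, 6, 2, 3]):
    print(sample, values_are_valid_indices(sample))
```

Only the first sample satisfies the constraint; the other two contain values larger than n, so using them as indexes runs off the end of the list.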
What you really want to do is run a nested for loop over the array:
    def find_duplicate(nums):
        # The outer loop visits every index i in [0, len(nums) - 1).
        # The inner loop starts one past i, so its range is recomputed on
        # each outer pass to be [i + 1, len(nums)).
        for i in range(0, len(nums) - 1):
            for j in range(i + 1, len(nums)):
                # if we find a matching element, return it
                if nums[i] == nums[j]:
                    return nums[i]
        # if we don't find anything return False
        return False
There are likely other, more Pythonic ways to achieve this, but that wasn't your original question.
First you must ensure that all numbers in the list satisfy your constraints.
To find duplicated numbers in a list, use Counter from collections. It returns each number with its number of occurrences, for example:
>>> from collections import Counter
>>> l=Counter([11,15,17,17,14])
>>> l
Counter({17: 2, 11: 1, 14: 1, 15: 1})
To get the most common one use:
>>> l.most_common(n=1)
[(17, 2)]
where n is the number of most common values you want to get.
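Building on the Counter idea, a short sketch that returns every repeated value rather than only the most common one could look like this. The function name find_duplicates is hypothetical, and the first-seen ordering relies on dictionaries preserving insertion order (Python 3.7 or later).

```python
from collections import Counter

def find_duplicates(nums):
    """Return every value that occurs more than once, in first-seen order."""
    counts = Counter(nums)
    return [value for value, count in counts.items() if count > 1]

print(find_duplicates([11, 15, 17, 17, 14]))  # [17]
print(find_duplicates([3, 1, 2, 6, 2, 3]))    # [3, 2]
```

Unlike the index-chasing approach, this places no constraint on the values in the list, at the cost of O(n) extra memory.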
    def duplicates(num_list):
        if not isinstance(num_list, list):
            print('No list provided')
            return
        if len(num_list) < 2:
            print('No duplicates')
            return
        for index, numA in enumerate(num_list):
            for indexB in range(index + 1, len(num_list)):
                if numA == num_list[indexB]:
                    print('Duplicate Number:' + str(numA))
                    return

    duplicates([11,15,17,17,14])
    duplicates([3,1,2,6,2,3])
    duplicates([])
    duplicates([5])
    n = int(input("the number of elements is: "))
    l = [int(input("the component is: ")) for _ in range(n)]
    print(l)
    # remember the values seen so far; the first repeat is the duplicate
    seen = set()
    duplicate = None
    for value in l:
        if value in seen:
            duplicate = value
            break
        seen.add(value)
    if duplicate is not None:
        print("duplicate found! it is", duplicate)
    else:
        print("no duplicate")
The code in the question is unfinished. It treats the array as a linked list in which index i points to index nums[i], which relies on every value being a valid index (the stated constraint that values are between 1 and n). So far it finds where the slow and fast pointers meet, but that is only half of the solution. To finish, start another pointer at the beginning of the list and advance it and the slow pointer one step at a time. The point where they meet is the entrance of the cycle, which in this problem is the duplicate number:
    from typing import List

    class Solution:
        def findDuplicate(self, nums: List[int]) -> int:
            slow, fast = 0, 0
            # phase 1: find a meeting point inside the cycle
            while True:
                slow = nums[slow]
                fast = nums[nums[fast]]
                if slow == fast:
                    break
            # phase 2: walk from the start and from the meeting point in step
            slow2 = 0
            while True:
                slow2 = nums[slow2]
                slow = nums[slow]
                if slow == slow2:
                    return slow2

Efficient way to get every integer vectors summing to a given number [duplicate]

I've been working on some quick and dirty scripts for doing some of my chemistry homework, and one of them iterates through lists of a constant length where all the elements sum to a given constant. For each, I check if they meet some additional criteria and tack them on to another list.
I figured out a way to meet the sum criteria, but it looks horrendous, and I'm sure there's some type of teachable moment here:
    # iterate through all 11-element lists where the elements sum to 8.
    for a in range(8+1):
        for b in range(8-a+1):
            for c in range(8-a-b+1):
                for d in range(8-a-b-c+1):
                    for e in range(8-a-b-c-d+1):
                        for f in range(8-a-b-c-d-e+1):
                            for g in range(8-a-b-c-d-e-f+1):
                                for h in range(8-a-b-c-d-e-f-g+1):
                                    for i in range(8-a-b-c-d-e-f-g-h+1):
                                        for j in range(8-a-b-c-d-e-f-g-h-i+1):
                                            k = 8-(a+b+c+d+e+f+g+h+i+j)
                                            x = [a,b,c,d,e,f,g,h,i,j,k]
                                            # see if x works for what I want
Here's a recursive generator that yields the lists in lexicographic order. Leaving exact as True gives the requested result where every sum==limit; setting exact to False gives all lists with 0 <= sum <= limit. The recursion takes advantage of this option to produce the intermediate results.
    def lists_with_sum(length, limit, exact=True):
        if length:
            for l in lists_with_sum(length-1, limit, False):
                gap = limit-sum(l)
                for i in range(gap if exact else 0, gap+1):
                    yield l + [i]
        else:
            yield []
Generic, recursive solution:
    def get_lists_with_sum(length, my_sum):
        if my_sum == 0:
            return [[0 for _ in range(length)]]
        if not length:
            return []  # no elements left but a positive sum remains
        elif length == 1:
            return [[my_sum]]
        else:
            lists = []
            for i in range(my_sum+1):
                rest = my_sum - i
                sublists = get_lists_with_sum(length-1, rest)
                for sl in sublists:
                    sl.insert(0, i)
                    lists.append(sl)
            return lists

    print(get_lists_with_sum(11, 8))
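A non-recursive alternative is the "stars and bars" construction: choosing where to put length - 1 dividers among total + length - 1 slots determines one list. This is a minimal sketch, assuming non-negative integer elements as in the question; the function name compositions is made up for illustration.

```python
from itertools import combinations

def compositions(length, total):
    """Yield every list of `length` non-negative ints that sums to `total`."""
    if length == 0:
        if total == 0:
            yield []
        return
    slots = total + length - 1
    for bars in combinations(range(slots), length - 1):
        # The gaps between consecutive bars (with sentinels at -1 and slots)
        # are the list elements.
        edges = (-1,) + bars + (slots,)
        yield [edges[i + 1] - edges[i] - 1 for i in range(length)]

print(list(compositions(3, 2)))
print(sum(1 for _ in compositions(11, 8)))  # 43758
```

Because it is a generator, the 11-element case can be filtered one list at a time without building all 43758 lists in memory first.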
