generator with conditions in python - python

I need to fill a list (lst) with random values taken from the predefined interval [a=10000, b=99999], subject to the following condition:
the sum of the digits standing at even positions must equal the sum of the digits standing at odd positions.

Here is a hit-or-miss generator that works well for numbers in your range:
import random
def even_odd_places(n):
    evens = []
    odds = []
    for i,d in enumerate(str(n),start = 1):
        if i % 2 == 0:
            evens.append(int(d))
        else:
            odds.append(int(d))
    return evens,odds
def check_condition(n):
    e,o = even_odd_places(n)
    return sum(e) == sum(o)
def rand_nums(a,b,max_tries = 1000):
    tries = 0
    while True:
        tries += 1
        n = random.randint(a,b)
        if check_condition(n):
            tries = 0
            yield n
        else:
            if tries == max_tries:
                return None
For example:
g = rand_nums(10000, 99999)
s = [next(g) for _ in range(10000)]
print(s[:5])
#typical output: [25641, 45980, 38225, 34320, 94380]
Something better than a hit-or-miss approach would require some nontrivial mathematical analysis. There is no easy way to generate all numbers in a given range which satisfy the target condition.
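For an interval as small as the one in the question, one alternative is to enumerate every qualifying number once by brute force and then sample from that list. The sketch below assumes this is acceptable for roughly 90,000 candidates; it does not scale to much larger ranges, and the names digit_sums_balanced and balanced_numbers are made up for illustration.

```python
import random

def digit_sums_balanced(n):
    """True if the digits at odd positions sum to the same value as those at even positions."""
    digits = [int(d) for d in str(n)]
    # digits[0::2] are positions 1, 3, 5, ...; digits[1::2] are positions 2, 4, ...
    return sum(digits[0::2]) == sum(digits[1::2])

def balanced_numbers(a, b):
    """All n in [a, b] that satisfy the digit-sum condition (brute-force scan)."""
    return [n for n in range(a, b + 1) if digit_sums_balanced(n)]

valid = balanced_numbers(10000, 99999)
# Sampling from the precomputed list never misses, unlike the hit-or-miss loop.
sample = random.choices(valid, k=5)
print(sample)
```

Every draw is then a single list lookup, at the cost of one up-front scan of the interval.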

Related

Performance improvement for calculating the powerset of a list of integers

I am trying to compute the powerset of a list of prime numbers. I have already done some research, and the preferred way of doing this seems to be a line like
itertools.chain.from_iterable(itertools.combinations(primes, r) for r in range(2, len(primes) + 1))
and then iterating over all combinations to get the products with math.prod(). All in all, the code currently looks like this:
import itertools
import math
number = 200
p1 = []
# calculate all primes below specified number
for i in range(2, number + 1):
    isPrime = True
    for prime in p1:
        if i % prime == 0:
            isPrime = False
    if isPrime:
        p1.append(i)
Pp = []
myIterable = itertools.chain.from_iterable(itertools.combinations(p1, r) for r in range(2, len(p1) + 1))
# convert iterable to integer array of products -- The code below is extremely slow and should be improved
for x in myIterable:
    newValue = math.prod(x)
    if newValue <= number:
        Pp.append(newValue)
This works, but it is not feasible for any "number" greater than 100 because the execution time is too high. The problem is the last for loop, which takes forever to run. Everything else performs reasonably well. The powerset has to be restricted to sets whose products are less than or equal to number, as done by the last if statement, or else the memory use will explode.
The solution to this problem was to create a pointer array, which crawls through the prime array until the product of the pointed primes gets too high. The needed helper functions can be implemented like this:
def calcProductOfPointers(pointerArray, dataArray):
    prod = 1
    for pointer in pointerArray:
        prod *= dataArray[pointer]
    return prod
def incrementPointer(pointerArray, dataArray, threshold):
    ret = False
    for i in range(1, len(pointerArray) + 1):
        index = len(pointerArray) - i
        pointerArray[index] += 1
        if calcProductOfPointers(pointerArray, dataArray) <= threshold and pointerArray[index] < len(dataArray):
            ret = True
            break
        elif index > 0:
            pointerArray[index] = pointerArray[index - 1] + 2
        else:
            break
    return ret
And then the iteration over all powersets can be substituted with this code:
Pp = []
for i in range(2, len(p1) + 1): # start at a minimum of 2 prime factors
    primePointers = []
    for index in range(i):
        primePointers.append(index)
    if calcProductOfPointers(primePointers, p1) > number:
        break
    while calcProductOfPointers(primePointers, p1) <= number:
        Pp.append(calcProductOfPointers(primePointers, p1))
        if not incrementPointer(primePointers, p1, number):
            break
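The same pruning idea (stop extending a combination as soon as its product exceeds the limit) can also be written recursively. This is only a sketch of an alternative, not the pointer-array method above; it assumes the primes are sorted in ascending order, and the name bounded_products is hypothetical.

```python
def bounded_products(primes, limit):
    """Products of at least two distinct primes that do not exceed limit.

    Assumes primes is sorted ascending, so the scan can stop at the first
    candidate that is too large.
    """
    results = []

    def extend(start, product, count):
        for k in range(start, len(primes)):
            candidate = product * primes[k]
            if candidate > limit:
                break  # every later prime is larger, so stop here
            if count + 1 >= 2:
                results.append(candidate)
            extend(k + 1, candidate, count + 1)

    extend(0, 1, 0)
    return results

small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(sorted(bounded_products(small_primes, 30)))
```

Because each branch is abandoned at the first product above the limit, the work is proportional to the number of combinations actually kept rather than to the size of the full powerset.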

Is there a way to make this reverse factorial code run more efficiently

I am just starting to learn Python and wrote a program that, given a factorial, finds the number it is the factorial of.
For example, if I give the program the number 120, it will tell me that it is the factorial of 5.
Anyway, my question is: how can I make this code faster and more efficient?
Num = int(input())
i=0
for i in range(0,Num):
    i = i + 1
    x = Num/i
    Num = x
    if (x==1):
        print(i)
Multiplications are much faster than divisions. You should try to reach the number with a factorial instead of dividing it iteratively:
def unfactorial(n):
    f,i = 1,1
    while f < n:
        i += 1
        f *= i
    return i if f == n else None
unfactorial(120) # 5
A few things you can do:
Num = int(input())
i=0 # your for loop will initialize i, you don't need to do this here
for i in range(0,Num):
    i = i + 1 # your for loop will increment i, no need to do this either
    x = Num/i # you don't need the extra variable 'x' here
    Num = x
    if (x==1):
        print(i)
You can rewrite this to look something like:
def reverse_factorial(number):
    for index in range(1, number + 1): # start range at 1; +1 so that small inputs such as 2 are reached
        number /= index # this means: number = number / index
        if number == 1:
            return index
Compute the factorials in ascending order until you reach (or exceed) the factorial you are looking for, using the previous factorial to efficiently compute the next.
def reverse_factorial(num):
    i = 1
    while num > 1:
        i += 1
        num /= i
    return i
print(reverse_factorial(int(input())))
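The division-based versions above use /, which produces floats in Python 3 and therefore loses precision for large factorials; they also return a value even when the input is not a factorial at all. A minimal sketch of an integer-only variant follows, assuming that returning None for non-factorials is the desired behaviour. The name reverse_factorial_exact is made up for illustration.

```python
import math

def reverse_factorial_exact(num):
    """Return i such that i! == num, or None if num is not a factorial."""
    if num < 1:
        return None
    i = 1
    while num > 1:
        i += 1
        num, remainder = divmod(num, i)  # exact integer division
        if remainder:
            return None  # num was not divisible by i, so it is not a factorial
    return i

print(reverse_factorial_exact(120))
print(reverse_factorial_exact(math.factorial(30)))
```

divmod keeps everything in arbitrary-precision integers, so the result stays exact however large the input is.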

a function that returns number of sums of a certain number.py

I need to write a function that returns the number of ways of reaching a certain number by adding numbers of a list. For example:
print(p([3,5,8,9,11,12,20], 20))
should return:5
The code I wrote is:
def pow(lis):
    power = [[]]
    for lst in lis:
        for po in power:
            power = power + [list(po)+[lst]]
    return power
def p(lst, n):
    counter1 = 0
    counter2 = 0
    power_list = pow(lst)
    print(power_list)
    for p in power_list:
        for j in p:
            counter1 += j
        if counter1 == n:
            counter2 += 1
            counter1 == 0
        else:
            counter1 == 0
    return counter2
pow() is a function that returns all of the subsets of the list and p should return the number of ways to reach the number n. I keep getting an output of zero and I don't understand why. I would love to hear your input for this.
Thanks in advance.
There are two typos in your code: counter1 == 0 is a boolean, it does not reset anything.
This version should work:
def p(lst, n):
    counter2 = 0
    power_list = pow(lst)
    for p in power_list:
        counter1 = 0 #reset the counter for every new subset
        for j in p:
            counter1 += j
        if counter1 == n:
            counter2 += 1
    return counter2
As tobias_k and Faibbus mentioned, you have a typo: counter1 == 0 instead of counter1 = 0, in two places. The counter1 == 0 produces a boolean object of True or False, but since you don't assign the result of that expression the result gets thrown away. It doesn't raise a SyntaxError, since an expression that isn't assigned is legal Python.
As John Coleman and B. M. mention it's not efficient to create the full powerset and then test each subset to see if it has the correct sum. This approach is ok if the input sequence is small, but it's very slow for even moderately sized sequences, and if you actually create a list containing the subsets rather than using a generator and testing the subsets as they're yielded you'll soon run out of RAM.
B. M.'s first solution is quite efficient since it doesn't produce subsets that are larger than the target sum. (I'm not sure what B. M. is doing with that dict-based solution...).
But we can enhance that approach by sorting the list of sums. That way we can break out of the inner for loop as soon as we detect a sum that's too high. True, we need to sort the sums list on each iteration of the outer for loop, but fortunately Python's TimSort is very efficient, and it's optimized to handle sorting a list that contains sorted sub-sequences, so it's ideal for this application.
def subset_sums(seq, goal):
    sums = [0]
    for x in seq:
        subgoal = goal - x
        temp = []
        for y in sums:
            if y > subgoal:
                break
            temp.append(y + x)
        sums.extend(temp)
        sums.sort()
    return sum(1 for y in sums if y == goal)
# test
lst = [3, 5, 8, 9, 11, 12, 20]
total = 20
print(subset_sums(lst, total))
lst = range(1, 41)
total = 70
print(subset_sums(lst, total))
output
5
28188
With lst = range(1, 41) and total = 70, this code is around 3 times faster than the B.M. lists version.
A one-pass solution with a single counter, which minimizes the number of additions.
def one_pass_sum(L,target):
    sums = [0]
    cnt = 0
    for x in L:
        for y in sums[:]:
            z = x+y
            if z <= target :
                sums.append(z)
                if z == target : cnt += 1
    return cnt
This way, with n = len(L), you make fewer than 2^n additions, compared with about n/2 * 2^n when all the subset sums are calculated from scratch.
EDIT:
A more efficient solution that just counts the ways. The idea is that if there are k ways to make z-x, then there are k more ways to make z once x becomes available.
def enhanced_sum_with_lists(L,target):
    cnt=[1]+[0]*target # 1 way to make 0
    for x in L:
        for z in range(target,x-1,-1): # [target, ..., x+1, x]
            cnt[z] += cnt[z-x]
    return cnt[target]
But order is important: z must be visited in descending order here to get the correct counts, since otherwise the same x could be used more than once (thanks to PM 2Ring).
This can be very fast (n*target additions) for big lists.
For example :
>>> enhanced_sum_with_lists(range(1,100),2500)
875274644371694133420180815
is obtained in 61 ms. It would take the age of the universe to compute it with the first method.
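To see why the direction of the inner loop matters, the sketch below runs the same counting update in both directions on a tiny input. The function name count_subsets and its descending flag are hypothetical and exist only for this comparison.

```python
def count_subsets(items, target, descending=True):
    """Count ways to reach target; descending=True uses each item at most once."""
    cnt = [1] + [0] * target  # one way to make 0: the empty subset
    for x in items:
        if descending:
            order = range(target, x - 1, -1)
        else:
            order = range(x, target + 1)
        for z in order:
            # In ascending order cnt[z - x] may already include x,
            # so x ends up being reused within the same pass.
            cnt[z] += cnt[z - x]
    return cnt[target]

print(count_subsets([1, 2], 3))                    # subsets only: {1, 2}
print(count_subsets([1, 2], 3, descending=False))  # also counts 1 + 1 + 1
print(count_subsets([3, 5, 8, 9, 11, 12, 20], 20))
```

With ascending order the update counts combinations with repetition, which is a different problem from counting subsets.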
from itertools import chain, combinations
def powerset_generator(i):
    for subset in chain.from_iterable(combinations(i, r) for r in range(len(i)+1)):
        yield set(subset)
def count_sum(s, cnt):
    return sum(1 for i in powerset_generator(s) if sum(k for k in i) == cnt)
print(count_sum(set([3,5,8,9,11,12,20]), 20))

Down to zero problem - getting time exceeded error

I am trying to solve a HackerRank problem.
You are given Q queries. Each query consists of a single number N. In each move you can perform one of two operations on N: if N = a×b (a≠1, b≠1), you can change N to max(a, b), or you can decrease the value of N by 1.
Determine the minimum number of moves required to reduce the value of N to 0.
I have used BFS approach to solve this.
a. Generate all prime numbers using a sieve.
b. For prime numbers I can simply skip calculating the factors.
c. I enqueue N-1 along with all the factors on the way to zero.
d. I also use previous results so that values already encountered are not enqueued again.
This still gives me a time limit exceeded error. Any ideas? I have also added comments in the code.
import math
#find out all the prime numbers
primes = [1]*(1000000+1)
primes[0] = 0
primes[1] = 0
for i in range(2, 1000000+1):
    if primes[i] == 1:
        j = 2
        while i*j < 1000000:
            primes[i*j] = 0
            j += 1
n = int(input())
for i in range(n):
    memoize= [-1 for i in range(1000000)]
    count = 0
    n = int(input())
    queue = []
    queue.append((n, count))
    while len(queue):
        data, count = queue.pop(0)
        if data <= 1:
            count += 1
            break
        #if it is a prime number then just enqueue -1
        if primes[data] == 1 and memoize[data-1] == -1:
            queue.append((data-1, count+1))
            memoize[data-1] = 1
            continue
        #enqueue -1 along with all the factors
        queue.append((data-1, count+1))
        sqr = int(math.sqrt(data))
        for i in range(sqr, 1, -1):
            if data%i == 0:
                div = max(int(data/i), i)
                if memoize[div] == -1:
                    memoize[div] = 1
                    queue.append((div, count+1))
    print(count)
There are two large causes of slowness with this code.
Clearing an array is slower than clearing a set
The first problem is this line:
memoize= [-1 for i in range(1000000)]
This prepares 1 million integers and is executed for each of your 1000 test cases. A faster approach is to simply use a Python set to record which values have already been visited.
Unnecessary loop being executed
The second problem is this line:
if primes[data] == 1 and memoize[data-1] == -1:
If data is a prime number but data-1 has already been visited, this condition is false, so the code falls through to the slow loop that searches for factors, which can never find any (because data is prime).
Faster code
In fact, the improvement due to using sets is so much that you don't even need your prime testing code and the following code passes all tests within the time limit:
import math
n = int(input())
for i in range(n):
    memoize = set()
    count = 0
    n = int(input())
    queue = []
    queue.append((n, count))
    while len(queue):
        data, count = queue.pop(0)
        if data <= 1:
            if data==1:
                count += 1
            break
        if data-1 not in memoize:
            memoize.add(data-1)
            queue.append((data-1, count+1))
        sqr = int(math.sqrt(data))
        for i in range(sqr, 1, -1):
            if data%i == 0:
                div = max(int(data/i), i)
                if div not in memoize:
                    memoize.add(div)
                    queue.append((div, count+1))
    print(count)
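One further tweak that may help: queue.pop(0) on a list shifts every remaining element, whereas collections.deque offers popleft() in constant time. The sketch below wraps the same BFS in a function so it can be run without reading input; the name down_to_zero_bfs is hypothetical, and math.isqrt assumes Python 3.8 or later.

```python
from collections import deque
import math

def down_to_zero_bfs(n):
    """Minimum number of moves to reduce n to 0, found by breadth-first search."""
    visited = {n}
    queue = deque([(n, 0)])
    while queue:
        value, moves = queue.popleft()  # O(1), unlike list.pop(0)
        if value == 0:
            return moves
        candidates = [value - 1]
        # For each divisor d <= sqrt(value), the larger factor is value // d.
        for d in range(math.isqrt(value), 1, -1):
            if value % d == 0:
                candidates.append(value // d)
        for nxt in candidates:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, moves + 1))

print(down_to_zero_bfs(3))
print(down_to_zero_bfs(4))
```

The set of visited values is still rebuilt per query here, exactly as in the answer above; only the queue type changes.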
Alternatively, there's an O(n*sqrt(n)) time and O(n) space solution that passes all the test cases just fine.
The idea is to cache the minimum counts for every non-negative integer up to 1,000,000 (the maximum possible input number in the question) before running any query. After doing so, each query just returns the minimum count stored in the cache for the given number, so retrieving a result takes O(1) time per query.
To find minimal counts for each number (let's call it down2ZeroCounts), we should consider several cases:
0 and 1 have minimal counts of 0 and 1, respectively.
A prime number p has no factors other than 1 and itself. Hence its minimal count is 1 plus the minimal count of p - 1, or more formally down2ZeroCounts[p] = down2ZeroCounts[p - 1] + 1.
For a composite number num it's a bit more complicated. For any pair of factors a > 1, b > 1 such that num = a*b, one move takes num to max(a, b), so the minimal count of num is the smallest of down2ZeroCounts[max(a, b)] + 1 over all such pairs and down2ZeroCounts[num - 1] + 1.
So, we can gradually build the minimal counts for each number in ascending order. The minimal count of each subsequent number is based on the optimal counts for lower numbers, so in the end the full list of optimal counts is built.
To better understand the approach please check the code:
from __future__ import print_function
import os
import sys
maxNumber = 1000000
down2ZeroCounts = [None] * 1000001
def cacheDown2ZeroCounts():
    down2ZeroCounts[0] = 0
    down2ZeroCounts[1] = 1
    currentNum = 2
    while currentNum <= maxNumber:
        if down2ZeroCounts[currentNum] is None:
            down2ZeroCounts[currentNum] = down2ZeroCounts[currentNum - 1] + 1
        else:
            down2ZeroCounts[currentNum] = min(down2ZeroCounts[currentNum - 1] + 1, down2ZeroCounts[currentNum])
        for i in xrange(2, currentNum + 1):
            product = i * currentNum
            if product > maxNumber:
                break
            elif down2ZeroCounts[product] is not None:
                down2ZeroCounts[product] = min(down2ZeroCounts[product], down2ZeroCounts[currentNum] + 1)
            else:
                down2ZeroCounts[product] = down2ZeroCounts[currentNum] + 1
        currentNum += 1
def downToZero(n):
    return down2ZeroCounts[n]
if __name__ == '__main__':
    fptr = open(os.environ['OUTPUT_PATH'], 'w')
    q = int(raw_input())
    cacheDown2ZeroCounts()
    for q_itr in xrange(q):
        n = int(raw_input())
        result = downToZero(n)
        fptr.write(str(result) + '\n')
    fptr.close()
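The code above targets Python 2 (xrange, raw_input) and the HackerRank harness. As a rough Python 3 sketch of the same table-building idea, small enough to check by hand, the function below takes the upper limit as a parameter. The name build_counts is hypothetical, and initialising each entry to n (n decrements always work) is a simplification chosen here rather than something taken from the original.

```python
def build_counts(limit):
    """Minimal move counts for every number from 0 to limit."""
    counts = list(range(limit + 1))  # upper bound: n single decrements
    for num in range(2, limit + 1):
        # Either decrement, or keep a better value pushed from a factor.
        counts[num] = min(counts[num], counts[num - 1] + 1)
        # num is the larger factor of every product i * num with i <= num.
        for i in range(2, num + 1):
            product = i * num
            if product > limit:
                break
            counts[product] = min(counts[product], counts[num] + 1)
    return counts

counts = build_counts(20)
print(counts[3], counts[4], counts[12], counts[16])
```

Each entry is final by the time the loop reaches it, because every factor pair of a number has a larger factor that was processed earlier.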

Python: while loop inside else

def is_prime(x):
    count = 1
    my_list = []
    while count > 0 and count < x:
        if x % count == 0:
            my_list.append(x/count)
        count += 1
    return my_list
my_list = is_prime(18)
def prime(x):
    my_list2 = []
    for number in my_list:
        if number <= 2:
            my_list2.append(number)
        else:
            count = 2
            while count < number:
                if number % count == 0:
                    break
                else:
                    my_list2.append(number)
                    count += 1
    return my_list2
print prime(18)
Just started out with Python. I have a very simple question.
This prints: [9, 3, 2].
Can someone please tell me why the loop inside my else stops at count = 2? In other words, the loop inside my loop doesn't seem to loop. If I can get my loop to work, hopefully this should print [2, 3]. Any insight is appreciated!
Assuming that my_list2 (not a very nice name for a list) is supposed to contain only the primes from my_list, you need to change your logic a little bit. At the moment, 9 is being added to the list because 9 % 2 != 0. Then 9 % 3 is tested and the loop breaks but 9 has already been added to the list.
You need to ensure that each number has no factors before adding it to the list.
There are much neater ways to do this but they involve things that you may potentially find confusing if you're new to python. This way is pretty close to your original attempt. Note that I've changed your variable names! I have also made use of the x that you are passing to get_prime_factors (in your question you were passing it to the function but not using it). Instead of using the global my_list I have called the function get_factors from within get_prime_factors. Alternatively you could pass in a list - I have shown the changes this would require in comments.
def get_factors(x):
    count = 1
    my_list = []
    while count > 0 and count < x:
        if x % count == 0:
            my_list.append(x/count)
        count += 1
    return my_list
# Passing in the number                 # Passing in a list instead
def get_prime_factors(x):               # get_prime_factors(factors):
    prime_factors = []
    for number in get_factors(x):       # for number in factors:
        if number <= 2:
            prime_factors.append(number)
        else:
            count = 2
            prime = True
            while count < number:
                if number % count == 0:
                    prime = False
                count += 1
            if prime:
                prime_factors.append(number)
    return prime_factors
print get_prime_factors(18)
output:
[3, 2]
Just to give you a taste of some of the more advanced ways you could go about doing this, get_prime_factors could be reduced to something like this:
def get_prime_factors(x):
    prime_factors = []
    for n in get_factors(x):
        if n <= 2 or all(n % count != 0 for count in xrange(2, n)):
            prime_factors.append(n)
    return prime_factors
all is a built-in function which would be very useful here. It returns true if everything it iterates through is true. xrange (range on python 3) allows you to iterate through a list of values without manually specifying a counter. You could go further than this too:
def get_prime_factors(x):
    return [n for n in get_factors(x) if n <= 2 or all(n % c != 0 for c in xrange(2, n))]
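The snippets in this answer are written for Python 2 (print statement, xrange, integer results from /). A possible Python 3 rendering is sketched below; it assumes // is wanted so that the factors stay integers, and it moves the primality test into a separate helper, is_prime, which is not part of the original answer.

```python
def get_factors(x):
    """Factors of x greater than 1, largest first (// keeps them as ints)."""
    return [x // count for count in range(1, x) if x % count == 0]

def is_prime(n):
    """True if n has no divisor between 2 and n - 1."""
    return n >= 2 and all(n % c != 0 for c in range(2, n))

def get_prime_factors(x):
    return [n for n in get_factors(x) if is_prime(n)]

print(get_prime_factors(18))
```

all() stops at the first divisor it finds, so composite numbers are rejected without testing every candidate.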
