I have a small script that calculates something. It uses a primitive brute-force algorithm and is inherently slow; I expect it to take about 30 minutes to complete. The script has only one print statement, at the end, when it is done. I would like to have something to make sure the script is still running. I do not want to include print statements for each iteration of the loop; that seems unnecessary. How can I make sure that a script that takes very long to execute is still running at a given time during its execution? I do not want this to slow my script down, though. This is my script.
def triangle_numbers(num):
    numbers = []
    for item in range(1, num):
        if num % item == 0:
            numbers.append(item)
    numbers.append(num)
    return numbers

count = 1
numbers = []
while True:
    if len(numbers) == 501:
        print numbers
        print count
        break
    numbers = triangle_numbers(count)
    count += 1
You could print every 500 loops (or choose another number).
while True:
    if len(numbers) == 501:
        print numbers
        print count
        break
    numbers = triangle_numbers(count)
    count += 1
    # print every 500 loops
    if count % 500 == 0:
        print count
This will let you know not only if it is running (which it obviously is unless it has finished), but how fast it is going (which I think might be more helpful to you).
FYI:
I expect your program will take more like 30 weeks than 30 minutes to compute. Try this:
'''
1. We only need to test for factors up to the square root of num.
2. Unless we are at the end, we only care about the number of numbers,
   not storing them in a list.
3. xrange is better than range in this case.
4. Since 501 is odd, the number must be a perfect square.
'''
def divisors_count(sqrt):
    num = sqrt * sqrt
    # every divisor below sqrt pairs with one above it; sqrt itself counts once
    return sum(2 for item in xrange(1, sqrt) if num % item == 0) + 1

def divisors(sqrt):
    num = sqrt * sqrt
    numbers = []
    for item in xrange(1, sqrt):
        if num % item == 0:
            numbers.append(item)
            numbers.append(num / item)  # the paired divisor
    numbers.append(sqrt)
    return sorted(numbers)

sqrt = 1
while divisors_count(sqrt) != 501:
    if sqrt % 500 == 0:
        print sqrt * sqrt
    sqrt += 1
print divisors(sqrt)
print sqrt * sqrt
though I suspect this will still take a long time. (In fact, I'm not convinced it will terminate.)
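For reference, the pairing idea from points 1 and 2 can be sketched in Python 3 for any positive integer, not just perfect squares. The name count_divisors is made up for this illustration, and math.isqrt assumes Python 3.8 or later.

```python
import math

def count_divisors(n):
    """Count the divisors of n by testing candidates up to sqrt(n) only."""
    root = math.isqrt(n)
    count = 0
    for d in range(1, root + 1):
        if n % d == 0:
            # d and n // d form a pair; count both unless they coincide.
            count += 1 if d * d == n else 2
    return count

if __name__ == "__main__":
    print(count_divisors(36))  # divisors: 1, 2, 3, 4, 6, 9, 12, 18, 36
    print(count_divisors(28))  # divisors: 1, 2, 4, 7, 14, 28
```

Only a perfect square gets the unpaired "+1", which is why an odd divisor count such as 501 forces the number to be a perfect square.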
Configure an external tool such as Supervisor to watch the process.
Supervisor starts its subprocesses via fork/exec and subprocesses don’t daemonize. The operating system signals Supervisor immediately when a process terminates, unlike some solutions that rely on troublesome PID files and periodic polling to restart failed processes.
I'm new to both Python and StackOverflow so I apologise if this question has been repeated too much or if it's not a good question. I'm doing a beginner's Python course and one of the tasks I have to do is to make a function that finds the next prime number after a given input. This is what I have so far:
def nextPrime(n):
    num = n + 1
    for i in range(1, 500):
        for j in range(2, num):
            if num%j == 0:
                num = num + 1
    return num
When I run it on the site's IDE, it's fine and everything works well but then when I submit the task, it says the runtime was too long and that I should optimise my code. But I'm not really sure how to do this, so would it be possible to get some feedback or any suggestions on how to make it run faster?
When your function finds the answer, it will continue checking the same number hundreds of times. This is why it is taking so long. Also, when you increase num, you should break out of the nested loop so that the new number is checked against the small factors first (which is more likely to eliminate it and would accelerate progress).
To make this simpler and more efficient, you should break down your problem in areas of concern. Checking if a number is prime or not should be implemented in its own separate function. This will make the code of your nextPrime() function much simpler:
def nextPrime(n):
    n += 1
    while not isPrime(n): n += 1
    return n
Now you only need to implement an efficient isPrime() function:
def isPrime(x):
    p,inc = 2,1
    while p*p <= x:
        if x % p == 0: return False
        p,inc = p+inc,2
    return x > 1
Looping from 1 to 500, especially with another loop nested inside it, is not only inefficient but also limits the range of the possible "next prime number" that you're trying to find. Therefore, you should use a while loop together with break, which lets you leave the loop as soon as you have found the prime number (of course, if the prompt states that the number is less than 501, your approach makes sense).
Furthermore, you only need to check the integers less than or equal to the square root of the number (in Python, num**0.5) to determine whether it is prime: divisors always come in pairs, and the smaller divisor of each pair is never larger than the square root.
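A minimal Python 3 sketch of those two suggestions (a while loop that breaks on the first hit, and trial division that stops at the square root) might look like this. The names is_prime and next_prime are chosen for the illustration, not taken from the course.

```python
def is_prime(n):
    """Trial division by candidates up to the square root of n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:  # same as d <= sqrt(n), without floating point
        if n % d == 0:
            return False
        d += 1
    return True

def next_prime(n):
    """Return the smallest prime strictly greater than n."""
    candidate = n + 1
    while True:
        if is_prime(candidate):
            break  # stop as soon as a prime is found
        candidate += 1
    return candidate

if __name__ == "__main__":
    print(next_prime(7))
    print(next_prime(100))
```

There is no fixed upper limit such as 500 here; the loop simply runs until a prime turns up.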
I have recently started using Python, and to try to learn I have set myself the task of running two chunks of code at once.
I have 1 chunk of code to generate and append prime numbers into a list
primes=[]
for num in range(1,999999999999 + 1):
    if num > 1:
        for i in range(2,num):
            if (num % i) == 0:
                break
        else:
            primes.append(num)
And another chunk of code to use the generated prime numbers to find perfect numbers
limit = 25000000000000000000
for p in primes():
    pp = 2**p
    perfect = (pp - 1) * (pp // 2)
    if perfect > limit:
        break
    elif is_prime(pp - 1):
        print(perfect)
I have heard of something to do with importing thread or something along those lines but I am very confused by it, if anyone can help by giving me clear instructions on what to do that would be very appreciated. I have only been learning python for about a week now.
Final note, I didn't code these calculations myself but I have modified them to what I need them for
You can use the multiprocessing library to accomplish this. The basic idea is to have two sets of processes. The first process can fill up a queue with primes, and then you can delegate other processes to deal with those primes and print your perfect numbers.
I did make some changes and implemented a basic is_prime function. (Note that for this implementation you only need to check until the square root of the number). There are better methods but that's not what this question is about.
Anyways, our append_primes function is the same as your first loop, except instead of appending a prime to a list, it puts a prime into a queue. We need some sort of signal to say that we're done appending primes, which is why we have q.put("DONE") at the end. The "DONE" is arbitrary and can be any kind of signal you want, as long as you handle it appropriately.
Then, the perfect_number is kind of like your second loop. It accepts a single prime and prints out a perfect number, if it exists. You may want to return it instead, but that depends on your requirements.
Finally, all of the logic that starts the processes has to sit inside an if __name__ == "__main__" block. On platforms that start a child process by importing the main module again (Windows, for example), code outside that block would be re-run in every child. We initialize our queue and create/start the process to append primes to this queue.
While that's running, we create our own version of a multiprocessing pool. Standard mp pools don't play along with queues, so we have to get a little fancy. We initialize the maximum number of processes we want to run and set it to the current CPU count minus 1 (since 1 will be running the append_primes function).
We loop over q until "DONE" is returned (remember, that's our signal from append_primes). We'll continuously loop over the process pool until we find an available process. Once that happens, we create and start the process, then move on to the next number.
Finally, we do some cleanup and make sure everything in processes is done by calling Process.join() which blocks until the process is done executing. We also ensure prime_finder has finished.
import multiprocessing as mp
import os

def is_prime(n):
    """ Returns True if n is prime """
    # + 1 so that the square root itself is tested (otherwise 4 and 9 pass)
    for i in range(2, int(n**0.5) + 1):
        if n%i == 0:
            return False
    return True

def append_primes(max, q):
    """ Searches for primes between 2 and max and adds them to the Queue (q) """
    pid = os.getpid()
    for num in range(2, int(max)+1):
        if is_prime(num):
            print(f"{pid} :: Put {num} in queue.")
            q.put(num)
    q.put("DONE") # A signal to stop processing
    return

def perfect_number(prime, limit = 25000000000000000000):
    """ Prints the perfect number, if it exists, given the prime """
    pp = 2**prime
    perfect = (pp - 1) * (pp // 2)
    if perfect > limit:
        return
    if is_prime(pp - 1):
        print(f"{os.getpid()} :: Perfect: {perfect}", flush = True)
    return

if __name__ == "__main__":
    q = mp.Queue()
    max = 1000 # When to stop looking for primes
    prime_finder = mp.Process(target = append_primes, args = (max, q,))
    prime_finder.start()

    # -1 because 1 is for prime_finder; always keep at least one worker slot
    n_processes = (os.cpu_count() or 2) - 1
    if n_processes < 1:
        n_processes = 1
    processes = [None]*n_processes

    for prime in iter(q.get, "DONE"):
        proc_started = False
        while not proc_started: # Check each process till we find an 'available' one.
            for m, proc in enumerate(processes):
                if proc is None or not proc.is_alive():
                    processes[m] = mp.Process(target = perfect_number, args = (prime, ))
                    processes[m].start()
                    proc_started = True # Get us out of the while loop
                    break # and out of the for loop.

    for proc in processes:
        if proc is None: # In case max < n_processes
            continue
        proc.join()
    prime_finder.join()
Comment out the print statement in append_primes if you only want to see the perfect number. The number that appears before is the process' ID (just so that you can see there are multiple processes working at the same time)
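The "DONE" sentinel and the iter(q.get, "DONE") loop are the core of the hand-off, so here is that pattern on its own as a small Python 3 sketch. To keep it short it uses a thread and queue.Queue instead of a process and mp.Queue, which is an assumption of this sketch: threads in CPython will not speed up CPU-bound work, but the producer/consumer logic has the same shape. The helper names producer, consume and run are made up for the example.

```python
import queue
import threading

DONE = "DONE"  # arbitrary sentinel, as in the answer above

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

def producer(limit, q):
    """Put every prime up to limit on the queue, then the sentinel."""
    for num in range(2, limit + 1):
        if is_prime(num):
            q.put(num)
    q.put(DONE)

def consume(q):
    """Collect items until the sentinel arrives."""
    # iter(callable, sentinel) keeps calling q.get() until it returns DONE.
    return list(iter(q.get, DONE))

def run(limit):
    q = queue.Queue()
    worker = threading.Thread(target=producer, args=(limit, q))
    worker.start()
    primes = consume(q)
    worker.join()
    return primes

if __name__ == "__main__":
    print(run(30))
```

q.get() blocks while the queue is empty, so the consumer waits for the producer instead of spinning.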
Why do 2 for loops at once when you can just put the logic of the second loop inside the first loop: Just instead of the break in the perfects loop use a bool to determine if you've reached the limit.
Also you don't need to check if num > 1. Just start the range at 2
primes=[]
limit = 25_000_000_000_000_000_000
reached_limit = False

def is_prime(n):
    # Trial division. A base-2 Fermat check such as 2**n % n == 2 is not
    # enough here: every 2**p - 1 with p prime passes it, prime or not.
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

for num in range(2, 1_000_000_000_000):
    for i in range(2,num):
        if (num % i) == 0:
            break
    else:
        primes.append(num)
        if not reached_limit:
            pp = 2 ** num
            perfect = (pp - 1) * (pp // 2)
            if perfect > limit:
                reached_limit = True
            elif is_prime(pp-1):
                print(perfect)
I currently have ↓ set as my randprime(p,q) function. Is there any way to condense this, via something like a genexp or listcomp? Here's my function:
n = randint(p, q)
while not isPrime(n):
    n = randint(p, q)
It's better to just generate the list of primes, and then choose from that list.
As is, with your code there is the slim chance that it will hit an infinite loop, either if there are no primes in the interval or if randint always picks a non-prime then the while loop will never end.
So this is probably shorter and less troublesome:
import random
primes = [i for i in range(p, q + 1) if isPrime(i)]  # q + 1 because randint(p, q) includes q
n = random.choice(primes)
The other advantage of this is that there is no chance of an infinite loop if there are no primes in the interval; random.choice raises an IndexError on an empty list instead. Building the list can be slow for a large range, so it would be quicker if you cached the primes ahead of time:
# initialising primes
minPrime = 0
maxPrime = 1000
cached_primes = [i for i in range(minPrime, maxPrime) if isPrime(i)]

# elsewhere in the code
import random
n = random.choice([i for i in cached_primes if p <= i <= q])  # inclusive, like randint
Again, further optimisations are possible, but are very much dependent on your actual code... and you know what they say about premature optimisations.
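One such optimisation, sketched here in Python 3 under a few assumptions: because the cached list is sorted, the primes inside [p, q] can be located with bisect instead of filtering the whole list on every call. The names CACHED_PRIMES and rand_prime and the cache bound of 1000 are illustrative, and the result is only meaningful for q below that bound.

```python
import bisect
import random

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# Built once; the bound 1000 is an arbitrary choice for this sketch.
CACHED_PRIMES = [i for i in range(1000) if is_prime(i)]

def rand_prime(p, q):
    """Return a random prime n with p <= n <= q, or None if there is none."""
    lo = bisect.bisect_left(CACHED_PRIMES, p)    # first index with prime >= p
    hi = bisect.bisect_right(CACHED_PRIMES, q)   # first index with prime > q
    if lo == hi:
        return None  # no cached prime in the interval
    return CACHED_PRIMES[random.randrange(lo, hi)]

if __name__ == "__main__":
    print(rand_prime(100, 200))
    print(rand_prime(14, 15))
```

Returning None for an empty interval is a design choice of the sketch; raising an exception would work just as well.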
Here is a script written in Python to generate n random prime integers between two given integers:
import numpy as np

def getRandomPrimeInteger(bounds):
    for i in range(bounds.__len__()-1):
        if bounds[i + 1] > bounds[i]:
            x = bounds[i] + np.random.randint(bounds[i+1]-bounds[i])
            if isPrime(x):
                return x
        else:
            if isPrime(bounds[i]):
                return bounds[i]
            if isPrime(bounds[i + 1]):
                return bounds[i + 1]
    newBounds = [0 for i in range(2*bounds.__len__() - 1)]
    newBounds[0] = bounds[0]
    for i in range(1, bounds.__len__()):
        newBounds[2*i-1] = int((bounds[i-1] + bounds[i])/2)
        newBounds[2*i] = bounds[i]
    return getRandomPrimeInteger(newBounds)

def isPrime(x):
    count = 0
    for i in range(int(x/2)):
        if x % (i+1) == 0:
            count = count+1
    return count == 1

#ex: get 50 random prime integers between 100 and 10000:
bounds = [100, 10000]
for i in range(50):
    x = getRandomPrimeInteger(bounds)
    print(x)
So it would be great if you could use an iterator to give the integers from p to q in random order (without replacement). I haven't been able to find a way to do that. The following will give random integers in that range and will skip anything that it's tested already.
import random
fail = False
tested = set([])
n = random.randint(p,q)
while not isPrime(n):
    tested.add(n)
    if len(tested) == q-p+1:  # every integer in [p, q] has been tried
        fail = True
        break
    while n in tested:
        n = random.randint(p,q)
if fail:
    print 'I failed'
else:
    print n, ' is prime'
The big advantage of this is that if say the range you're testing is just (14,15), your code would run forever. This code is guaranteed to produce an answer if such a prime exists, and tell you there isn't one if such a prime does not exist. You can obviously make this more compact, but I'm trying to show the logic.
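For what it's worth, visiting the integers from p to q in random order without replacement can be sketched in Python 3 with random.sample over a range. This assumes p <= q and that a list of q - p + 1 integers fits in memory; the function name rand_prime_shuffled is made up for the example.

```python
import random

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

def rand_prime_shuffled(p, q):
    """Try every integer in [p, q] exactly once, in random order."""
    # Sampling the whole range without replacement is a random permutation.
    for n in random.sample(range(p, q + 1), q - p + 1):
        if is_prime(n):
            return n
    return None  # the interval contains no prime

if __name__ == "__main__":
    print(rand_prime_shuffled(100, 10000))
    print(rand_prime_shuffled(14, 15))
```

Like the loop above, this terminates whether or not the interval contains a prime, but it never draws the same number twice.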
next(i for i in itertools.imap(lambda x: random.randint(p,q)|1,itertools.count()) if isPrime(i))
This starts with itertools.count(), which gives an infinite iterator.
Each number is mapped to a new random number in the range by itertools.imap(). imap is like map, but returns an iterator rather than a list; we don't want to generate a list of infinitely many random numbers!
Then, the first matching number is found, and returned.
Works efficiently, even if p and q are very far apart - e.g. 1 and 10**30, which generating a full list won't do!
By the way, this is not more efficient than your code above, and is a lot more difficult to understand at a glance - please have some consideration for the next programmer to have to read your code, and just do it as you did above. That programmer might be you in six months, when you've forgotten what this code was supposed to do!
P.S. In practice, you might want to replace count() with xrange (NOT range!), e.g. xrange(int((q-p)**1.5) + 20), to make no more than that number of attempts (balanced between limited tests for small ranges and large ranges, with no more than a 1/2% chance of failing when it could succeed); otherwise, as was suggested in another post, you might loop forever.
PPS - improvement: replaced random.randint(p,q) with random.randint(p,q)|1 - this makes the code twice as efficient, but eliminates the possibility that the result will be 2.
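In Python 3 there is no imap or xrange (map and range are already lazy), so a bounded version of the same idea might be sketched as below. The max_tries parameter and the function name are assumptions of this sketch. Note that `| 1` can turn an even q into q + 1, so the sketch re-checks the upper bound.

```python
import random

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

def rand_odd_prime(p, q, max_tries=1000):
    """Return a random odd prime found within max_tries draws, else None."""
    # Lazy generator: nothing is drawn until next() asks for a value.
    draws = (random.randint(p, q) | 1 for _ in range(max_tries))
    # `| 1` forces each candidate to be odd; drop any that overshoot q.
    return next((n for n in draws if n <= q and is_prime(n)), None)

if __name__ == "__main__":
    print(rand_odd_prime(1, 10**6))
```

The second argument to next() is the fallback returned when all attempts fail, which replaces the risk of looping forever with a None result.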
I have this python code to generate prime numbers. I added a little piece of code (between # Start progress code and # End progress code) to display the progress of the operation but it slowed down the operation.
#!/usr/bin/python
a = input("Enter a number: ")
f = open('data.log', 'w')
for x in range (2, a):
    p = 1
    # Start progress code
    s = (float(x)/float(a))*100
    print '\rProcessing ' + str(s) + '%',
    # End progress code
    for i in range(2, x-1):
        c = x % i
        if c == 0:
            p = 0
            break
    if p != 0:
        f.write(str(x) + ", ")
print '\rData written to \'data.log\'. Press Enter to exit...'
raw_input()
My question is how to show the progress of the operation without slowing down the actual code/loop. Thanks in advance ;-)
To answer your question: I/O is very expensive, so printing out your progress on every iteration will have a huge impact on performance. I would avoid printing if possible.
If you are concerned about speed, there is a very nice optimization you can use to greatly speed up your code.
For your inner for loop, instead of
for i in range(2, x-1):
    c = x % i
    if c == 0:
        p = 0
        break
use
for i in range(2, int(x**0.5) + 1):
    c = x % i
    if c == 0:
        p = 0
        break
You only need to test divisors from 2 up to the square root of the number whose primality you are testing.
You can use this optimization to offset any performance loss from printing your progress.
Your inner loop is O(n) time. If you're experiencing lag toward huge numbers then it's pretty normal. Also you're converting x and a into float while performing division; as they get bigger, it could slow down your process.
First, I hope this is a toy problem, because (on quick glance) it looks like the whole operation is O(n^2).
You probably want to put this at the top:
from __future__ import division # Make "/" true (floating point) division; "//" stays integer division.
Typically for huge loops where each iteration is inexpensive, progress isn't output every iteration because it would take too long (as you say you are experiencing). Try outputting progress either a fixed number of times or with a fixed duration between outputs:
N_OUTPUTS = 100
OUTPUT_EVERY = max(1, (a-2) // N_OUTPUTS)
...
    # Start progress code
    if x % OUTPUT_EVERY == 0:
        print '\rProcessing {:.0f}%'.format(100 * x / a),
    # End progress code
Or if you want to go by time instead:
UPDATE_DT = 0.5
import time
t = time.time()
...
    # Start progress code
    if time.time() - t > UPDATE_DT:
        print '\rProcessing {:.0f}%'.format(100 * x / a),
        t = time.time()
    # End progress code
That's going to be a little more expensive, but will guarantee that even as the inner loop slows down, you won't be left in the dark for more than one iteration or 0.5 seconds, whichever takes longer.
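The time-based variant can be packaged as a small helper. The following is a Python 3 sketch, not a drop-in for the Python 2 script above; the Progress class and its parameters are invented for the illustration, and the clock and output stream are injectable only so that the behaviour is easy to check.

```python
import sys
import time

class Progress:
    """Print a progress line at most once every `interval` seconds."""

    def __init__(self, total, interval=0.5, clock=time.monotonic, out=sys.stderr):
        self.total = total
        self.interval = interval
        self.clock = clock
        self.out = out
        self.last = clock()
        self.reports = 0

    def update(self, done):
        now = self.clock()
        if now - self.last >= self.interval:
            # One cheap clock read per call; the write happens only rarely.
            self.out.write("\rProcessing {:.1f}%".format(100.0 * done / self.total))
            self.out.flush()
            self.last = now
            self.reports += 1

if __name__ == "__main__":
    limit = 20000
    progress = Progress(limit)
    primes = []
    for x in range(2, limit):
        progress.update(x)
        if all(x % d for d in range(2, int(x**0.5) + 1)):
            primes.append(x)
    print("\nfound", len(primes), "primes")
```

time.monotonic is used instead of time.time so that adjustments to the system clock cannot confuse the interval check.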
So I've been messing around with python's multiprocessing lib for the last few days and I really like the processing pool. It's easy to implement and I can visualize a lot of uses. I've done a couple of projects I've heard about before to familiarize myself with it and recently finished a program that brute forces games of hangman.
Anywho, I was doing an execution time comparison of summing all the prime numbers between 1 million and 2 million, both single threaded and through a processing pool. Now, for the hangman cruncher, putting the games in a processing pool improved execution time by about 8 times (i7 with 8 cores), but when grinding out these primes, it actually increased processing time by almost a factor of 4.
Can anyone tell me why this is? Here is the code for anyone interested in looking at or testing it:
#!/user/bin/python.exe
import math
from multiprocessing import Pool

global primes
primes = []

def log(result):
    global primes
    if result:
        primes.append(result[1])

def isPrime( n ):
    if n < 2:
        return False
    if n == 2:
        return True, n
    max = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= max:
        if n % i == 0:
            return False
        i += 1
    return True, n

def main():
    global primes
    #pool = Pool()
    for i in range(1000000, 2000000):
        #pool.apply_async(isPrime,(i,), callback = log)
        temp = isPrime(i)
        log(temp)
    #pool.close()
    #pool.join()
    print sum(primes)
    return

if __name__ == "__main__":
    main()
It currently runs in a single process; to run it through the processing pool, uncomment the pool statements and comment out the other lines in the main for loop.
The most efficient way to use multiprocessing is to divide the work into n equal-sized chunks, with n the size of the pool, which should be approximately the number of cores on your system. The reason for this is that the work of starting subprocesses and communicating between them is quite large. If each chunk of work is small and there are many chunks, the overhead of IPC becomes significant.
In your case, you're asking multiprocessing to process each prime individually. A better way to deal with the problem is to pass each worker a range of values, (probably just a start and end value) and have it return all of the primes in that range it found.
In the case of identifying large-ish primes, the work done grows with the starting value, so you probably don't want to divide the total range into exactly n chunks, but rather into n*k equal chunks, with k some reasonable, small number, say 10 to 100. That way, when some workers finish before others, there's more work left to do and it can be balanced efficiently across all workers.
Edit: Here's an improved example to show what that solution might look like. I've changed as little as possible so you can compare apples to apples.
#!/user/bin/python.exe
import math
from multiprocessing import Pool

global primes
primes = set()

def log(result):
    global primes
    if result:
        # since the result is a batch of primes, we have to use
        # update instead of add (or for a list, extend instead of append)
        primes.update(result)

def isPrime( n ):
    if n < 2:
        return False
    if n == 2:
        return True, n
    max = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= max:
        if n % i == 0:
            return False
        i += 1
    return True, n

def isPrimeWorker(start, stop):
    """
    find a batch of primes
    """
    primes = set()
    for i in xrange(start, stop):
        if isPrime(i):
            primes.add(i)
    return primes

def main():
    global primes
    pool = Pool()
    # pick an arbitrary chunk size, this will give us 100 different
    # chunks, but another value might be optimal
    step = 10000
    # use xrange instead of range, we don't actually need a list, just
    # the values in that range.
    for i in xrange(1000000, 2000000, step):
        # call the *worker* function with start and stop values.
        pool.apply_async(isPrimeWorker,(i, i+step,), callback = log)
    pool.close()
    pool.join()
    print sum(primes)
    return

if __name__ == "__main__":
    main()
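For readers on Python 3, the same chunking idea might be sketched as follows. This is an illustration under stated assumptions rather than a port of the code above: make_chunks and sum_primes are made-up names, the chunk count of 100 is arbitrary, and each worker returns only the sum for its chunk (instead of the set of primes) to keep the data sent between processes small.

```python
import math
from multiprocessing import Pool

def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def make_chunks(start, stop, n_chunks):
    """Split [start, stop) into at most n_chunks contiguous (lo, hi) ranges."""
    step = max(1, math.ceil((stop - start) / n_chunks))
    return [(lo, min(lo + step, stop)) for lo in range(start, stop, step)]

def sum_primes(bounds):
    """Worker: sum the primes in one (lo, hi) chunk."""
    lo, hi = bounds
    return sum(n for n in range(lo, hi) if is_prime(n))

if __name__ == "__main__":
    chunks = make_chunks(1_000_000, 2_000_000, 100)
    with Pool() as pool:
        # Each task is a whole chunk, so IPC happens 100 times, not a million.
        print(sum(pool.map(sum_primes, chunks)))
```

With about 100 chunks and a pool sized to the number of cores, workers that finish early can pick up remaining chunks, which is the n*k balancing described earlier.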