Python concurrent.futures trying to import functions

So I have two .py files and am trying to import the test function from the first into the second. But every time I try, I just get a "BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending." error. I have no idea what I'm messing up; help is very much appreciated.
parallel.py:
import time
from concurrent import futures

def test(t):
    time.sleep(t)
    print("I waited {} seconds. Time: {:.0f}".format(t, time.time()))

def main():
    print("Start time: {:.0f}".format(time.time()))
    start = time.perf_counter()
    with futures.ThreadPoolExecutor(max_workers=3) as ex:
        ex.submit(test, 9)
        ex.submit(test, 4)
        ex.submit(test, 5)
        ex.submit(test, 6)
        print("All tasks started.")
    print("All tasks done.")
    finish = time.perf_counter()
    print("Finished in", round(finish - start, 2), "second(s)")

if __name__ == "__main__":
    main()
parallel2.py:
import parallel
import time
import concurrent.futures

# =============================================================================
# def test(t):
#     time.sleep(t)
#     return ("I waited {} seconds. Time: {:.0f}".format(t, time.time()))
# =============================================================================

def main():
    print("Start time: {:.0f}".format(time.time()))
    start = time.perf_counter()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        f1 = executor.submit(parallel.test, 9)
        f2 = executor.submit(parallel.test, 5)
        f3 = executor.submit(parallel.test, 4)
        f4 = executor.submit(parallel.test, 6)
        print(f1.result())
        print(f2.result())
        print(f3.result())
        print(f4.result())
    finish = time.perf_counter()
    print("Finished in", round(finish - start, 2), "second(s)")

if __name__ == "__main__":
    main()

Try this solution: remove the condition if __name__ == "__main__" from parallel.py.
You put the condition if __name__ == "__main__" in both scripts to execute the main function.
With this condition, a script checks whether it is the main module and runs the guarded code only if the check is true.
When a script is imported by another one, its __name__ is no longer "__main__", so the condition is false and the guarded code does not run.
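The rule the answer relies on can be checked directly: Python sets __name__ to "__main__" only in the module it started as the script; every imported module sees its own name instead, which is why code behind the guard never runs on import. A minimal illustration:

```python
# __name__ is "__main__" only in the module Python started directly.
# Any module pulled in via import sees its own name instead, so code
# behind an `if __name__ == "__main__":` guard never runs on import.
import json

print(__name__)        # the directly-run script reports "__main__"
print(json.__name__)   # an imported module reports its own name
```

This is also why ProcessPoolExecutor needs the guard in the script that creates the pool: worker processes may re-import that file, and the guard keeps them from spawning pools of their own.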

Related

Callback does not work in pool.map_async()

In the following simple program, the callback passed to pool.map_async() does not seem to work properly. Could someone point out what is wrong?
import os
import multiprocessing
import time

def cube(x):
    return "{}^3={}".format(x, x**3)

def prt(value):
    print(value)

if __name__ == "__main__":
    pool = multiprocessing.Pool(3)
    start_time = time.perf_counter()
    result = pool.map_async(cube, range(1, 1000), callback=prt)
    finish_time = time.perf_counter()
    print(f"Program finished in {finish_time-start_time} seconds")
$ python3 /var/tmp/cube_map_async_callback.py
Program finished in 0.0001492840237915516 seconds
$
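The likely culprit here is not the callback itself: map_async() returns immediately, and the script reaches its end before the workers have produced a result, so the callback never gets a chance to fire. Closing and joining the pool (or calling result.wait() / result.get()) makes the main process wait. A sketch of that fix, with the caveat that the callback receives the whole result list in a single call, not one call per item:

```python
import multiprocessing
import time

def cube(x):
    return "{}^3={}".format(x, x**3)

def prt(value):
    # map_async delivers the *complete* result list to the callback at once
    print("callback got", len(value), "results")

if __name__ == "__main__":
    pool = multiprocessing.Pool(3)
    start_time = time.perf_counter()
    result = pool.map_async(cube, range(1, 1000), callback=prt)
    pool.close()   # no further tasks will be submitted
    pool.join()    # wait for the workers, so the callback can run
    finish_time = time.perf_counter()
    print(f"Program finished in {finish_time - start_time} seconds")
```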

threading returning DataFrame using queue

Probably a simple question, I'm a beginner.
So, I have a script with a function that returns a DataFrame using threading. It's working fine, but I'm having trouble understanding what's going on when I create the thread. The code:
import queue
import threading
import time
from time import perf_counter
import pandas as pd

def test(word, word1):
    print(f'start {word}')
    df_1 = pd.DataFrame({'test_1': [word], 'test_2': [word1]})
    time.sleep(2)
    print(f'end {word}')
    return df_1

my_queue = queue.Queue()

if __name__ == "__main__":
    t1_start = perf_counter()
    x = threading.Thread(target=lambda q, arg1, arg2: q.put(test(arg1, arg2)), args=(my_queue, 'first', 'ok1'))
    x1 = threading.Thread(target=lambda q, arg1, arg2: q.put(test(arg1, arg2)), args=(my_queue, 'second', 'ok2'))
    x1.start()
    x.start()
    print('\nrun something')
    x.join()
    x1.join()
    t2_start = perf_counter()
    print(f'\n time:{t2_start-t1_start}')
As I said, this script works fine and gives me the expected output.
But if I try to remove the lambda function as below:
if __name__ == "__main__":
    t1_start = perf_counter()
    x = threading.Thread(target=my_queue.put(test('first','ok1')))
    x1 = threading.Thread(target=my_queue.put(test('second','ok2')))
    x1.start()
    x.start()
    print('\nrun something')
    x.join()
    x1.join()
    t2_start = perf_counter()
    print(f'\n time:{t2_start-t1_start}')
The script still works, but the threading does not; the calls run one after the other.
Why do I need to use a lambda function for the threading to work?

Let two functions run periodically with different 'sampling' times

I already managed to execute one function periodically with a specific sampling time T using the Python scheduler from the sched package:
import sched
import time

def cycle(sche, T, fun, arg):
    sche.enter(T, 1, cycle, (sche, T, fun, arg))
    fun(arg)

def fun(arg):
    print(str(time.time()))
    print(arg)

def main():
    scheduler = sched.scheduler(time.time, time.sleep)
    T = 1
    arg = "some argument"
    cycle(scheduler, T, fun, arg)
    scheduler.run()
What I would like to do is add another function fun2() that will also be executed periodically, with another sample time T2.
What would be a proper way to do that?
The following solution worked for me:
Since I will have two CPU-bound tasks, I set up a multiprocessing environment with two processes. Each process starts its own scheduler that runs 'forever' with its own 'sampling' time.
What does anybody with more experience in Python than me (I've just started :-D) think about this approach? Will it cause any problems, in your opinion?
import time
import multiprocessing
import sched

global schedule1
global schedule2

def fun1(arg):
    print("I'm the function that is executed every T1")
    time.sleep(0.05)  # do something for t < T1

def fun2(arg):
    print("I'm the function that is executed every T2")
    time.sleep(0.8)  # do something for t < T2

def cycle1(scheduler1, T1, fun, arg):
    global schedule1
    try:
        schedule1.append(scheduler1.enter(T1, 1, cycle1, (scheduler1, T1, fun, arg)))
        fun1(arg)
        scheduler1.run()
    except KeyboardInterrupt:
        for event in schedule1:
            try:
                scheduler1.cancel(event)
            except ValueError:
                continue
        return

def cycle2(scheduler2, T2, fun, arg):
    global schedule2
    try:
        schedule2.append(scheduler2.enter(T2, 1, cycle2, (scheduler2, T2, fun, arg)))
        fun2(arg)
        scheduler2.run()
    except KeyboardInterrupt:
        for event in schedule2:
            try:
                scheduler2.cancel(event)
            except ValueError:
                continue
        return

def main():
    global schedule1
    global schedule2
    schedule1 = []
    schedule2 = []
    scheduler1 = sched.scheduler(time.time, time.sleep)
    scheduler2 = sched.scheduler(time.time, time.sleep)
    T1 = 0.1
    T2 = 1
    list_of_arguments_for_fun1 = []
    list_of_arguments_for_fun2 = []
    processes = []
    # set up first process
    process1 = multiprocessing.Process(target=cycle1, args=(scheduler1, T1, fun1, list_of_arguments_for_fun1))
    processes.append(process1)
    # set up second process
    process2 = multiprocessing.Process(target=cycle2, args=(scheduler2, T2, fun2, list_of_arguments_for_fun2))
    processes.append(process2)
    process1.start()
    process2.start()
    for process in processes:
        process.join()
    # anything below here in main() won't be executed

if __name__ == "__main__":
    try:
        start = time.perf_counter()
        main()
    except KeyboardInterrupt:
        print('\nCancelled by User. Bye!')
    finish = time.perf_counter()
    print(f'Finished in {round(finish - start, 2)} second(s)')
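One note on the approach: two processes make sense for two genuinely CPU-bound tasks, but if the functions are cheap, a single sched scheduler can interleave both periods in one process, since run() always executes whichever event is due next. A minimal sketch of that alternative (the counters are added here only so the demo terminates; a real service would re-arm unconditionally):

```python
import sched
import time

ticks1, ticks2 = [], []

def fun1():
    ticks1.append(time.time())   # stand-in for the T1 task

def fun2():
    ticks2.append(time.time())   # stand-in for the T2 task

def cycle(scheduler, period, fun, remaining):
    # re-arm first, then run, so one scheduler drives both periods
    if remaining > 0:
        scheduler.enter(period, 1, cycle, (scheduler, period, fun, remaining - 1))
    fun()

s = sched.scheduler(time.time, time.sleep)
cycle(s, 0.05, fun1, 5)   # fun1 fires 6 times, 0.05 s apart
cycle(s, 0.12, fun2, 2)   # fun2 fires 3 times, 0.12 s apart
s.run()                   # one blocking loop serves both schedules
```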

How can I measure execution time of a Python program (functional structure)?

I need to measure the execution time of a Python program having the following structure:
import numpy
import pandas

def func1():
    code

def func2():
    code

if __name__ == '__main__':
    func1()
    func2()
If I want to use "time.time()", where should I put the calls in the code? I want to get the execution time for the whole program.
Alternative 1:
import time
start = time.time()

import numpy
import pandas

def func1():
    code

def func2():
    code

if __name__ == '__main__':
    func1()
    func2()

end = time.time()
print("The execution time is", end - start)
Alternative 2:
import numpy
import pandas

def func1():
    code

def func2():
    code

if __name__ == '__main__':
    import time
    start = time.time()
    func1()
    func2()
    end = time.time()
    print("The execution time is", end - start)
On Linux, you can run this file test.py using the time command:
time python3 test.py
After your program runs it will give you the following output:
real 0m0.074s
user 0m0.004s
sys 0m0.000s
The difference between the three: real is elapsed wall-clock time, user is CPU time spent in user mode, and sys is CPU time spent in the kernel on behalf of the process.
The whole program:
import time
t1 = time.time()

import numpy
import pandas

def func1():
    code

def func2():
    code

if __name__ == '__main__':
    func1()
    func2()
    t2 = time.time()
    print("The execution time is", t2 - t1)
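One refinement worth mentioning: time.time() follows the wall clock, which can jump (e.g. NTP adjustments), so for measuring durations time.perf_counter() is generally preferred. The same pattern with perf_counter, using a small computation as a stand-in for func1()/func2():

```python
import time

start = time.perf_counter()
total = sum(i * i for i in range(100_000))   # stand-in for func1(); func2()
elapsed = time.perf_counter() - start
print(f"The execution time is {elapsed:.4f} seconds")
```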

timing a python program with threads

I have the following block of code that is part of a larger program. I am trying to get it to print the execution time once all of the threads are closed but can't seem to get it to work. Any ideas?
import time
import csv
import threading
import urllib.request

def openSP500file():
    SP500 = csv.reader(open(r'C:\Users\test\Desktop\SP500.csv', 'r'), delimiter=',')
    for x in SP500:
        indStk = x[0]
        t1 = StockData(indStk)
        t1.start()
    if not t1.isAlive():
        print(time.clock() - start_time, 'seconds')
    else:
        pass

def main():
    openSP500file()

if __name__ == '__main__':
    start_time = time.clock()
    main()
Thanks!
You aren't waiting for all the threads to finish (only the last one created). Perhaps something like this in your thread-spawning loop?
threads = []
for x in SP500:
    t1 = StockData(x[0])
    t1.start()
    threads.append(t1)

for t in threads:
    t.join()

# ... print running time
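Since StockData isn't shown in the question, here is a runnable sketch of the full pattern with a hypothetical Thread subclass standing in for it. Note also that time.clock() was removed in Python 3.8, and isAlive() was removed in 3.9 (use is_alive()), so time.perf_counter() is used for the measurement:

```python
import threading
import time

class StockData(threading.Thread):
    # hypothetical stand-in for the asker's StockData thread
    def __init__(self, symbol):
        super().__init__()
        self.symbol = symbol

    def run(self):
        time.sleep(0.01)  # simulate fetching the quote

start_time = time.perf_counter()

threads = []
for symbol in ["AAA", "BBB", "CCC"]:   # stand-in for the CSV rows
    t = StockData(symbol)
    t.start()
    threads.append(t)

for t in threads:   # wait for *every* thread, not just the last one
    t.join()

print(time.perf_counter() - start_time, 'seconds')
```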
