Unexpected behavior when using Python thread locks and circular imports - python

I wrote a simple test program using thread locks. This program does not behave as expected, and the Python interpreter does not complain.
test1.py:
from __future__ import with_statement
from threading import Thread, RLock
import time
import test2

lock = RLock()

class Test1(object):
    def __init__(self):
        print("Start Test1")
        self.test2 = test2.Test2()
        self.__Thread = Thread(target=self.myThread, name="thread")
        self.__Thread.daemon = True
        self.__Thread.start()
        self.test1Method()

    def test1Method(self):
        print("start test1Method")
        with lock:
            print("entered test1Method")
            time.sleep(5)
            print("end test1Method")

    def myThread(self):
        self.test2.test2Method()

if __name__ == "__main__":
    client = Test1()
    raw_input()
test2.py:
from __future__ import with_statement
import time
import test1

lock = test1.lock

class Test2(object):
    def __init__(self):
        print("Start Test2")

    def test2Method(self):
        print("start test2Method")
        with lock:
            print("entered test2Method")
            time.sleep(5)
            print("end test2Method")
Both sleeps are executed at the same time! That is not what I expected when using the lock.
When test2Method is moved to test1.py, everything works fine. When I create the lock in test2.py and import it in test1.py, everything works fine. When I create the lock in a separate source file and import it in both test1.py and test2.py, everything works fine.
It probably has to do with circular imports.
But why doesn't Python complain about this?

In Python, when you execute a script using $ python test1.py, the file is imported as __main__, not as test1. So if you want the lock defined in the launched script, you shouldn't import test1; you should import __main__. If you import test1 instead, Python loads the file a second time and creates another lock, different from the one in __main__ (test1.lock != __main__.lock).
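A quick way to see the two objects (a sketch of my own, using the original scripts above): at the end of test1.py's if __name__ == "__main__": block, fetch the re-imported module and compare the locks:

if __name__ == "__main__":
    client = Test1()
    import test1   # already loaded a second time (under the name "test1") via test2's import
    print(lock is test1.lock)   # False: __main__.lock and test1.lock are two different RLocks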
So one fix to your problem (which is far from being the best) that also shows what is happening is to change your two scripts to this:
test1.py:
from __future__ import with_statement
from threading import Thread, RLock
import time

lock = RLock()

class Test1(object):
    def __init__(self):
        print("Start Test1")
        import test2  # <<< Import is done here to be able to refer to __main__.lock.
        self.test2 = test2.Test2()
        self.__Thread = Thread(target=self.myThread, name="thread")
        self.__Thread.daemon = True
        self.__Thread.start()
        self.test1Method()

    def test1Method(self):
        print("start test1Method")
        with lock:
            print("entered test1Method")
            time.sleep(5)
            print("end test1Method")

    def myThread(self):
        self.test2.test2Method()

if __name__ == "__main__":
    client = Test1()
    raw_input()
test2.py:
from __future__ import with_statement
import time
# <<< test1 is changed to __main__ to get the same lock as the one used in the launched script.
import __main__

lock = __main__.lock

class Test2(object):
    def __init__(self):
        print("Start Test2")

    def test2Method(self):
        print("start test2Method")
        with lock:
            print("entered test2Method")
            time.sleep(5)
            print("end test2Method")
HTH,

Using print statements before and after the import statements and printing id(lock) right after it's been created reveals that there are in fact two locks being created. It seems the module is imported twice, and mouad explains in his answer that this is because test1.py is imported first as __main__ and then as test1, which causes the lock to be instantiated twice.
Be that as it may, using a global lock is not a good solution anyway. There are several better solutions, and I think you'll find one of them will suit your needs.
Instantiate the lock as a class variable of Test1, and pass it as an argument to Test2
Instantiate the lock as a normal variable of Test1 in __init__, and pass it as an argument to Test2.
Instantiate the lock in the if __name__ == "__main__" block and pass it to Test1, and then from Test1 to Test2.
Instantiate the lock in the if __name__ == "__main__" block and first instantiate Test2 with the lock, then pass the Test2 instance and the lock to Test1. (This is the most decoupled way of doing it, and I'd recommend going with this method. It will ease unit testing, at the very least.).
Here's the code for the last suggestion:
test1.py:
import time
from threading import Thread, RLock
import test2

class Test1(object):
    def __init__(self, lock, test2):
        print("Start Test1")
        self.lock = lock
        self.test2 = test2
        self.__Thread = Thread(target=self.myThread, name="thread")
        self.__Thread.daemon = True
        self.__Thread.start()
        self.test1Method()

    def test1Method(self):
        print("start test1Method")
        with self.lock:
            print("entered test1Method")
            time.sleep(1)
            print("end test1Method")

    def myThread(self):
        self.test2.test2Method()

if __name__ == "__main__":
    lock = RLock()
    test2 = test2.Test2(lock)
    client = Test1(lock, test2)
test2.py:
import time

class Test2(object):
    def __init__(self, lock):
        self.lock = lock
        print("Start Test2")

    def test2Method(self):
        print("start test2Method")
        with self.lock:
            print("entered test2Method")
            time.sleep(1)
            print("end test2Method")

As others have said, the problem is not in the threading but in your special case of cyclic imports.
Why special? Because the usual workflow (mod1 imports mod2 and mod2 imports mod1) looks like this:
1. You want to use module mod1, so you import it (import mod1).
2. When Python finds it, the interpreter adds it to sys.modules and starts executing its code.
3. When it reaches the line import mod2, it pauses execution of mod1 and starts executing mod2.
4. When the interpreter reaches import mod1 inside mod2, it doesn't load mod1 again, because it was already added to sys.modules.
5. After that (unless some code in mod2 accesses a not-yet-initialized name from mod1), the interpreter finishes executing mod2 and then mod1.
But in your case, at step 4 Python executes test1 one more time, because there is no test1 in sys.modules! The reason is that you didn't import it in the first place; you ran it from the command line, so it was registered as __main__ instead.
So just don't use cyclic imports - as you can see, it is a real mess.
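The cleanest fix is the one the question already discovered: put the lock in a third module that both files import, so neither has to import the other. A minimal sketch (the module name locks.py is my own choice for illustration):

# locks.py
from threading import RLock

lock = RLock()

# test1.py and test2.py then both use:
#     from locks import lock
# and exactly one RLock exists, no matter how the program is started.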

Related

Why doesn't the second python function run? Infinite loop

I have to make two separate functions, one that prints 'Ha' after every 1 second, and another one that prints 'Ho' after every 2 seconds.
But b() is greyed out, and it won't run.
import time

def a():
    while True:
        time.sleep(1)
        print('Ha')

def b():
    while True:
        time.sleep(2)
        print('Ho')

a()
b()
Why is the second function not running?
Edit: I have to have 2 separate functions that can both run infinitely.
The b() function is never called because a() never returns. Here's a simple approach that achieves what you are looking for:
import time

count = 0
while True:
    time.sleep(1)
    print("Ha")
    if count % 2:
        print("Ho")
    count += 1
If you made your program multi-threaded, you could run a() in one thread and b() in another thread and your approach would work.
Here's a multi-threaded version of your program, which allows both your a() and b() functions to run somewhat simultaneously:
import time
from threading import Thread

class ThreadA(Thread):
    def a():
        while True:
            time.sleep(1)
            print('Ha')

    def run(self):
        ThreadA.a()

class ThreadB(Thread):
    def b():
        while True:
            time.sleep(2)
            print('Ho')

    def run(self):
        ThreadB.b()

ThreadA().start()
ThreadB().start()
Edit: Here's a simpler multi-threaded version which allows you to specify the function to execute when you start the thread (I probably should look through the threading module for this feature - it seems like it should be in there):
import time
from threading import Thread

class ThreadAB(Thread):
    def run(self):
        self.func()

    def start(self, func):
        self.func = func
        super().start()

def a():
    while True:
        time.sleep(1)
        print('Ha')

def b():
    while True:
        time.sleep(2)
        print('Ho')

ThreadAB().start(a)
ThreadAB().start(b)
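As it turns out, this feature is indeed in the threading module: Thread accepts a target callable directly, so no subclass is needed at all. A minimal sketch, assuming the same a and b as above:

from threading import Thread

Thread(target=a).start()
Thread(target=b).start()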
Here's an absolutely horrible solution that runs both functions in different processes without threads or classes:
#!/usr/bin/env python
import time
import sys
import os

def a():
    while True:
        time.sleep(1)
        print('Ha')

def b():
    while True:
        time.sleep(2)
        print('Ho')

if len(sys.argv) > 1:
    eval(sys.argv[1])
else:
    os.system(f"{sys.argv[0]} 'a()' &")
    os.system(f"{sys.argv[0]} 'b()' &")
For the above solution to work, I made my program executable and ran it from the command line like this:
command
The results were somewhat awful. I kicked off two programs running at the same time in the background. One of the programs printed Ha and the other Ho. They both were running in the background so I had to use the following command to kill them:
ps -ef | grep command | awk '{print $2}' | xargs kill
Edit: And finally, here's an asyncio approach (my first time writing something like this):
import asyncio

async def a():
    while True:
        print('Ha(1)')
        await asyncio.sleep(1)

async def b():
    while True:
        print('Ho(2)')
        await asyncio.sleep(2)

async def main():
    taskA = loop.create_task(a())
    taskB = loop.create_task(b())
    await asyncio.wait([taskA, taskB])

if __name__ == "__main__":
    try:
        loop = asyncio.get_event_loop()
        loop.run_until_complete(main())
    except KeyboardInterrupt:
        pass
Credit for the above code: https://www.velotio.com/engineering-blog/async-features-in-python
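On Python 3.7+, asyncio.run can replace the manual loop management (and asyncio.gather the explicit task bookkeeping); a minimal sketch, assuming the same a and b coroutines as above:

import asyncio

async def main():
    await asyncio.gather(a(), b())

if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        pass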

Threading Python3

I am trying to use Threading in Python, and struggle to kick off two functions at the same time, then wait for both to finish and load returned data into variables in the main code. How can this be achieved?
import threading
from threading import Thread

def func1():
    # <do something>
    return (x, y, z)

def func2():
    # <do something>
    return (a, b, c)

Thread(target=func1).start()
Thread(target=func2).start()
# <hold until both threads are done, load returned values>
More clarity is definitely needed in the question. Perhaps you're after something like the below?
import threading
from threading import Thread

def func1():
    print("inside func1")
    return 5

def func2():
    print("inside func2")
    return 6

if __name__ == "__main__":
    t1 = Thread(target=func1)
    t2 = Thread(target=func2)
    threads = [t1, t2]
    for t in threads:
        t.start()
I believe you were missing the start() method to actually launch your threads?
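To actually block until both threads finish and collect their return values (the part the question asks about), the standard library's concurrent.futures module is the simplest route; a minimal sketch of my own, with dummy return values:

from concurrent.futures import ThreadPoolExecutor

def func1():
    return (1, 2, 3)

def func2():
    return ('a', 'b', 'c')

with ThreadPoolExecutor() as executor:
    f1 = executor.submit(func1)
    f2 = executor.submit(func2)
    x, y, z = f1.result()   # blocks until func1 finishes
    a, b, c = f2.result()   # blocks until func2 finishes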

Get live value of variable from another script

I have two scripts, new.py and test.py.
Test.py
import time

while True:
    x = "hello"
    time.sleep(1)
    x = "world"
    time.sleep(1)
new.py
import time

while True:
    import test
    x = test.x
    print(x)
    time.sleep(1)
Now from my understanding this should print "hello" and a second later "world" all the time when executing new.py.
It does not print anything; how can I fix that?
Thanks
I think the code below captures what you are asking. Here I simulate two scripts running independently (by using threads), then show how you can use shelve to communicate between them. Note, there are likely much better ways to get to what you are after -- but if you absolutely must run the scripts independently, this will work for you.
Incidentally, any persistent source would do (such as a database).
import shelve
import time
import threading

def script1():
    while True:
        with shelve.open('my_store') as holder3:
            if holder3['flag'] is not None:
                break
        print('waiting')
        time.sleep(1)
    print("Done")

def script2():
    print("writing")
    with shelve.open('my_store') as holder2:
        holder2['flag'] = 1

if __name__ == "__main__":
    with shelve.open('my_store') as holder1:
        holder1['flag'] = None
    t = threading.Thread(target=script1)
    t.start()
    time.sleep(5)
    script2()
    t.join()
Yields:
waiting
waiting
waiting
waiting
waiting
writing
Done
Test.py
import time

def hello():
    callList = ['hello', 'world']
    for item in callList:
        print(item)
        time.sleep(1)

hello()
new.py
from test import hello

while True:
    hello()

Using Multiprocessing with Modules

I am writing a module such that in one function I want to use the Pool function from the multiprocessing library in Python 3.6. I have done some research on the problem, and it seems that you cannot use if __name__ == "__main__" because the code is not being run from main. I have also noticed that the Python pool processes get initialized in my task manager but essentially are stuck.
So for example:
class myClass():
    ...
    # lots of different functions here
    ...
    def multiprocessFunc():
        pass  # do stuff in here

    def funcThatCallsMultiprocessFunc():
        array = [array of filenames to be called]
        if __name__ == "__main__":
            p = Pool(processes=20)
            p.map_async(multiprocessFunc, array)
I tried to remove the if __name__ == "__main__" part, but still no dice. Any help would be appreciated.
It seems to me that you have just missed out a self. from your code. I should think this will work:
class myClass():
    ...
    # lots of different functions here
    ...
    def multiprocessFunc(self, file):
        pass  # do stuff in here

    def funcThatCallsMultiprocessFunc(self):
        array = [array of filenames to be called]
        p = Pool(processes=20)
        p.map_async(self.multiprocessFunc, array)  # added self. here
Now having done some experiments, I see that map_async could take quite some time to start up (I think because multiprocessing creates processes) and any test code might call funcThatCallsMultiprocessFunc and then quit before the Pool has got started.
In my tests I had to wait for over 10 seconds after funcThatCallsMultiprocessFunc before calls to multiprocessFunc started. But once started, they seemed to run just fine.
This is the actual code I've used:
MyClass.py
from multiprocessing import Pool
import time
import string

class myClass():
    def __init__(self):
        self.result = None

    def multiprocessFunc(self, f):
        time.sleep(1)
        print(f)
        return f

    def funcThatCallsMultiprocessFunc(self):
        array = [c for c in string.ascii_lowercase]
        print(array)
        p = Pool(processes=20)
        p.map_async(self.multiprocessFunc, array, callback=self.done)
        p.close()

    def done(self, arg):
        self.result = 'Done'
        print('done', arg)
Run.py
from MyClass import myClass
import time

def main():
    c = myClass()
    c.funcThatCallsMultiprocessFunc()
    for i in range(30):
        print(i, c.result)
        time.sleep(1)

if __name__ == "__main__":
    main()
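If blocking until the results arrive is acceptable, the AsyncResult that map_async returns can be waited on directly instead of polling a flag; a self-contained sketch of my own (not the author's code above):

from multiprocessing import Pool
import string
import time

def work(c):
    time.sleep(1)
    return c

if __name__ == "__main__":
    with Pool(processes=20) as p:
        res = p.map_async(work, string.ascii_lowercase)
        print(res.get())   # blocks until every work() call has finished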
The if __name__=='__main__' construct is an import protection. You want to use it to stop multiprocessing from running your setup on import.
In your case, you can leave out this protection in the class setup. Be sure to protect the execution points of the class in the calling file like this:
def apply_async_with_callback():
    pool = mp.Pool(processes=30)
    for i in range(z):
        pool.apply_async(parallel_function, args=(i, x, y), callback=callback_function)
    pool.close()
    pool.join()
    print("Multiprocessing done!")

if __name__ == '__main__':
    apply_async_with_callback()

Control running Python Process (multiprocessing)

I have yet another question about Python multiprocessing.
I have a module that creates a Process and just runs in a while True loop.
This module is meant to be enabled/disabled from another Python module.
That other module will import the first one once and is also run as a process.
How would I better implement this?
So, for reference:
# foo.py
def foo():
    while True:
        if enabled:
            pass  # do something

p = Process(target=foo)
p.start()
and imagine the second module to be something like this:
# bar.py
import foo, time

def bar():
    while True:
        foo.enable()
        time.sleep(10)
        foo.disable()

Process(target=bar).start()
Constantly running a process checking for condition inside a loop seems like a waste, but I would gladly accept the solution that just lets me set the enabled value from outside.
Ideally I would prefer to be able to terminate and restart the process, again from outside of this module.
From my understanding, I would use a Queue to pass commands to the Process. If it is indeed just that, can someone show me how to set it up in a way that I can add something to the queue from a different module.
Can this even be easily done with Python, or is it time to abandon hope and switch to something like C or Java?
I proposed in a comment two different approaches:
using a shared variable from multiprocessing.Value
pausing / resuming the process with signals
Control by sharing a variable
from multiprocessing import Process, Value
import time

def target_process_1(run_statement):
    while True:
        if run_statement.value:
            print("I'm running !")
        time.sleep(1)

def target_process_2(run_statement):
    time.sleep(3)
    print("Stopping")
    run_statement.value = False
    time.sleep(3)
    print("Resuming")
    run_statement.value = True

if __name__ == "__main__":
    run_statement = Value("i", 1)
    process_1 = Process(target=target_process_1, args=(run_statement,))
    process_2 = Process(target=target_process_2, args=(run_statement,))
    process_1.start()
    process_2.start()
    time.sleep(8)
    process_1.terminate()
    process_2.terminate()
Control by sending a signal
from multiprocessing import Process
import time
import os, signal

def target_process_1():
    while True:
        print("Running !")
        time.sleep(1)

def target_process_2(target_pid):
    time.sleep(3)
    os.kill(target_pid, signal.SIGSTOP)
    time.sleep(3)
    os.kill(target_pid, signal.SIGCONT)

if __name__ == "__main__":
    process_1 = Process(target=target_process_1)
    process_1.start()
    process_2 = Process(target=target_process_2, args=(process_1.pid,))
    process_2.start()
    time.sleep(8)
    process_1.terminate()
    process_2.terminate()
Side note: if possible, do not run a while True loop that constantly polls a condition.
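One way to avoid the busy-wait that the side note warns about is multiprocessing.Event, whose wait() blocks without consuming CPU until another process sets the flag; a minimal sketch of my own, not part of the original answer:

from multiprocessing import Process, Event
import time

def worker(run_event):
    while True:
        run_event.wait()          # blocks here while disabled, consuming no CPU
        print("I'm running!")
        time.sleep(1)

if __name__ == "__main__":
    run_event = Event()
    run_event.set()               # start enabled
    p = Process(target=worker, args=(run_event,))
    p.start()
    time.sleep(3)
    run_event.clear()             # pause the worker
    time.sleep(3)
    run_event.set()               # resume it
    time.sleep(3)
    p.terminate()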
EDIT: if you want to manage your process from two different files, supposing you want to use control by sharing a variable, this is one way to do it:
# file foo.py
from multiprocessing import Value, Process
import time

__all__ = ['start', 'stop', 'enable', 'disable']

_statement = None
_process = None

def _target(run_statement):
    """ Target of foo's process """
    while True:
        if run_statement.value:
            print("I'm running !")
        time.sleep(1)

def start():
    global _process, _statement
    _statement = Value("i", 1)
    _process = Process(target=_target, args=(_statement,))
    _process.start()

def stop():
    global _process, _statement
    _process.terminate()
    _statement, _process = None, None

def enable():
    _statement.value = True

def disable():
    _statement.value = False
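For completeness, the Queue approach the question asked about also works: the controlling module puts command strings on a multiprocessing.Queue and the worker polls it. A minimal sketch of my own illustration, not part of the answer above:

from multiprocessing import Process, Queue
import queue
import time

def worker(commands):
    enabled = True
    while True:
        try:
            cmd = commands.get_nowait()   # non-blocking poll for a command
            if cmd == "enable":
                enabled = True
            elif cmd == "disable":
                enabled = False
            elif cmd == "stop":
                return
        except queue.Empty:
            pass
        if enabled:
            print("I'm running!")
        time.sleep(1)

if __name__ == "__main__":
    commands = Queue()
    p = Process(target=worker, args=(commands,))
    p.start()
    time.sleep(3)
    commands.put("disable")   # any module holding the queue can send commands
    time.sleep(3)
    commands.put("stop")
    p.join()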
