I am having trouble with my understanding of threading, or possibly of how variables are passed and assigned across threads in Python. I have a simple program that takes a list of the stocks currently displayed on a screen and fetches the quote information for each of them. I am using threads so that I can constantly update the screen and constantly collect data. I have two issues:
Inside dataCollector_thread() I understand that if I append to stocksOnScreenListInfo, the variable stocksOnScreenListInfo inside main is updated.
However, I don't want to append to the list; I want to reassign it, as in the following, but this does not work:
    def dataCollector_thread(stocksOnScreenListInfo, stocksOnScreen):
        while(True):
            placeholder = []
            for stock in stocksOnScreen:
                placeholder.append(RetrieveQuote(stock))
            stocksOnScreenListInfo = placeholder
            time.sleep(5)
Inside screenUpdate_thread I want to update stocksOnScreen to the value ['TSLA'] returned by the function UpdateScreen. This does not seem to update the corresponding stocksOnScreen in main: when I print it to check, it still says 'AAPL'.
    def main(args):
        stocksOnScreen = ["AAPL"] # List of the stocks currently displayed on LED screen
        stocksOnScreenListInfo = [] # The quote information list for each stock on screen
        thread_data_collector = threading.Thread(target=dataCollector_thread, args=(stocksOnScreenListInfo,stocksOnScreen))
        thread_data_collector.daemon = True
        thread_data_collector.start()
        thread_screen = threading.Thread(target=screenUpdate_thread, args=(stocksSearchArray,stocksOnScreen))
        thread_screen.daemon = True
        thread_screen.start()

    def dataCollector_thread(stocksOnScreenListInfo, stocksOnScreen):
        while(True):
            for stock in stocksOnScreen:
                stocksOnScreenListInfo.append(RetrieveQuote(stock))
            time.sleep(5)

    def screenUpdate_thread(stocksSearchArray, stocksOnScreen):
        while(True):
            stocksOnScreen = UpdateScreen(stocksSearchArray)

    def UpdateScreen(stocksSearchArray):
        pass
        return ["TSLA"]
There are a couple of issues with this function:
    def dataCollector_thread(stocksOnScreenListInfo, stocksOnScreen):
        while(True):
            placeholder = []
            for stock in stocksOnScreen:
                placeholder.append(RetrieveQuote(stock))
            stocksOnScreenListInfo = placeholder
            time.sleep(5)
In this function you're rebinding the local name stocksOnScreenListInfo to a new list, placeholder, which has no effect on the list in main. What you want is to modify the contents in place so that stocksOnScreenListInfo in main is updated. You can do that with slice assignment: stocksOnScreenListInfo[:] = placeholder (which means: replace the contents from beginning to end with those of the new list).
stocksOnScreen could change while you're iterating over it in the for loop, since you're updating it in another thread. You should do this atomically. A lock (that you pass as a parameter to the function) will help here: it's a synchronisation primitive designed to prevent data races when multiple threads share data and at least one of them modifies it.
I can't see stocksOnScreenListInfo being used anywhere else in your code. Is it used in another function? If so, you should think about putting a lock around that as well.
I would modify the function like this:
    def dataCollector_thread(stocksOnScreenListInfo, stocksOnScreen, lock):
        while True:
            placeholder = []
            with lock: # use lock to ensure you atomically access stocksOnScreen
                for stock in stocksOnScreen:
                    placeholder.append(RetrieveQuote(stock))
            stocksOnScreenListInfo[:] = placeholder # modify contents of stocksOnScreenListInfo
            time.sleep(5)
In your other thread function:
    def screenUpdate_thread(stocksSearchArray, stocksOnScreen):
        while(True):
            stocksOnScreen = UpdateScreen(stocksSearchArray)
you're binding the local name stocksOnScreen to a new list within this function; it won't affect stocksOnScreen in main. Again, you can update the original with the notation stocksOnScreen[:] = new_list. I would acquire the lock before updating stocksOnScreen, so that your other thread function, dataCollector_thread, accesses stocksOnScreen atomically, like so:
    def screenUpdate_thread(stocksSearchArray, stocksOnScreen, lock):
        while True:
            updated_list = UpdateScreen(stocksSearchArray) # build new list - doesn't have to be atomic
            with lock:
                stocksOnScreen[:] = updated_list # update contents of stocksOnScreen
            time.sleep(0.001)
As you can see I put in a small sleep, otherwise the function will loop constantly and be too CPU-intensive. Plus it will give Python a chance to context switch between your thread functions.
Finally, in main create a lock:
lock = threading.Lock()
and pass lock to both functions as a parameter.
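Putting the pieces above together, here is a minimal runnable sketch. retrieve_quote and update_screen are hypothetical stand-ins (the real RetrieveQuote and UpdateScreen are not shown in the question), the loops are bounded so the script terminates, and the sleeps are shortened. One small variation, which is my assumption rather than part of the answer: the collector takes a snapshot of the symbols under the lock and fetches quotes outside it, so a slow fetch does not block the screen thread.

```python
import threading
import time


def retrieve_quote(stock):
    # Hypothetical stand-in for RetrieveQuote(); returns a fake quote.
    return {"symbol": stock, "price": 100.0}


def update_screen():
    # Hypothetical stand-in for UpdateScreen().
    return ["TSLA"]


def data_collector(info, on_screen, lock, rounds=3):
    for _ in range(rounds):
        with lock:
            symbols = list(on_screen)  # snapshot taken under the lock
        quotes = [retrieve_quote(s) for s in symbols]
        with lock:
            info[:] = quotes  # replace contents in place; the caller sees it
        time.sleep(0.01)


def screen_updater(on_screen, lock, rounds=3):
    for _ in range(rounds):
        new_list = update_screen()  # built outside the lock
        with lock:
            on_screen[:] = new_list  # replace contents in place
        time.sleep(0.01)


def run_demo():
    on_screen = ["AAPL"]
    info = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=data_collector, args=(info, on_screen, lock)),
        threading.Thread(target=screen_updater, args=(on_screen, lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return on_screen, info
```

Calling run_demo() returns the two lists as main would see them after both threads finish; on_screen ends up as ["TSLA"] because the updater replaced its contents rather than rebinding a local name.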
stocksOnScreen = ... changes the reference itself. Since the reference is passed to the function/thread as a parameter, the change is made to the copy of the original reference inside the function/thread. (Each function/thread has its own copy.)
So instead you should manipulate the list object it refers to (e.g. with list.clear() and list.extend()).
However, as you can see, the update is then no longer an atomic action, so there is a chance that dataCollector_thread will work on an empty list (i.e. do nothing) and sleep for 5 seconds. I have included a possible workaround below as well, though I'm not sure it works (perfectly):
    def dataCollector_thread(stocksOnScreen):
        while(True):
            sos_copy = stocksOnScreen.copy() # *might* avoid the race?
            for stock in sos_copy:
                print(stock)
            if (len(sos_copy) > 0): # *might* avoid the race?
                time.sleep(5)

    def UpdateScreen():
        return ["TSLA"]

    def screenUpdate_thread(stocksOnScreen):
        while(True):
            # manipulate the list object instead of changing the reference (copy) in the function
            stocksOnScreen.clear()
            # race condition: dataCollector_thread might work on an empty list and sleep 5 seconds
            stocksOnScreen.extend(UpdateScreen())

    def main():
        stocksOnScreen = ["AAPL"] # List of the stocks currently displayed on LED screen
        thread_data_collector = threading.Thread(target=dataCollector_thread, args=(stocksOnScreen,)) # note the comma
        thread_data_collector.daemon = True
        thread_data_collector.start()
        thread_screen = threading.Thread(target=screenUpdate_thread, args=(stocksOnScreen,)) # note the comma
        thread_screen.daemon = True
        thread_screen.start()
Note: according to this answer, python lists are thread-safe, so the copy workaround should work.
You can probably use global instead of passing stocksOnScreen as a parameter as well:
    def dataCollector_thread():
        global stocksOnScreen # superfluous if no re-assignment
        while(True):
            for stock in stocksOnScreen:
                print(stock)
            time.sleep(5)

    def UpdateScreen():
        return ["TSLA"]

    def screenUpdate_thread():
        global stocksOnScreen # needed for re-assignment
        while(True):
            stocksOnScreen = UpdateScreen()

    def main():
        global stocksOnScreen # Or create stocksOnScreen outside main(), which is a function itself
        stocksOnScreen = ["AAPL"] # List of the stocks currently displayed on LED screen
        thread_data_collector = threading.Thread(target=dataCollector_thread)
        thread_data_collector.daemon = True
        thread_data_collector.start()
        thread_screen = threading.Thread(target=screenUpdate_thread)
        thread_screen.daemon = True
        thread_screen.start()
Ref.: https://docs.python.org/3/faq/programming.html#what-are-the-rules-for-local-and-global-variables-in-python
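To make the reference-versus-object distinction from this answer concrete, here is a small sketch without any threads. The function names rebind and replace_contents are made up for illustration.

```python
def rebind(items):
    # Rebinds the local name only; the caller's list is untouched.
    items = ["TSLA"]
    return items


def replace_contents(items):
    # Mutates the list object that the caller also refers to.
    items[:] = ["TSLA"]


a = ["AAPL"]
rebind(a)

b = ["AAPL"]
replace_contents(b)

print(a, b)  # ['AAPL'] ['TSLA']
```

The same rule applies when the function is used as a thread target: the thread receives its own copy of the reference, so only changes to the object itself are visible outside.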
You have three options here since Python, like Java, passes parameters by value and not by reference (the value being passed is a reference to the object, so rebinding the parameter inside the function does not affect the caller).
First, use a global parameter.
    import threading
    def threadFunction():
        global globalParam  # must be declared inside the function, before the assignment
        globalParam = "I've ran"
    threading.Thread(target=threadFunction).start()
Second, an updater function.
    def threadFunction(update):
        update("I've ran")
    threading.Thread(target=threadFunction, args=((lambda x: print(x)),)).start()
Third, expose a global parameter holder.
    def threadFunction(param1, param2):
        globalParams[0] = param1 + " Just Got access"
    globalParams = ["Param1", "Param2"]
    threading.Thread(target=threadFunction, args=tuple(globalParams)).start()
I hope this answered your question ;)
I have a function def act(obs) that returns a float and is computationally expensive (takes some time to run).
    import time
    import random

    def act(obs):
        time.sleep(5) # mimic computation time
        action = random.random()
        return action
I keep calling this function in a script more often than it can finish executing. I do not want any waiting time when calling the function; I would rather use the value returned by an earlier computation. How do I achieve this?
Something I have thought of is having a global variable that is updated in the function and that I keep reading, although I am not sure that is the best way to achieve it.
This is what I ended up using, based on this answer:
    import threading
    import time

    class MyClass:
        def __init__(self):
            self.is_updating = False
            self.result = -1

        def _act(self, obs):
            self.is_updating = True
            time.sleep(5)  # mimic computation time
            self.result = obs
            self.is_updating = False

        def act(self, obs):
            # start a new computation only if none is running; return the last result
            if not self.is_updating:
                threading.Thread(target=self._act, args=[obs]).start()
            return self.result

    agent = MyClass()
    i = 0
    while True:
        agent.act(obs=i)
        time.sleep(2)
        print(i, agent.result)
        i += 1
The global variable approach should work. Alternatively, you can have a class with a private member, say result, a flag isComputing, and a method getResult that calls a method compute() (in a thread) if no computation is currently running and returns the previous result. The compute() method should update the isComputing flag properly.
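A minimal sketch of the class described above, with a lock added around the busy check so that two callers cannot both start a computation. The names BackgroundResult, get_result and wait are made up for illustration, and the sketch assumes the wrapped function is safe to run in a worker thread.

```python
import threading


class BackgroundResult:
    """Returns the latest finished result and recomputes in the background."""

    def __init__(self, func, initial=None):
        self._func = func            # the expensive function
        self._result = initial       # last finished result
        self._lock = threading.Lock()
        self._worker = None          # most recently started worker thread

    def _compute(self, obs):
        value = self._func(obs)      # slow part runs without holding the lock
        with self._lock:
            self._result = value

    def get_result(self, obs):
        with self._lock:
            busy = self._worker is not None and self._worker.is_alive()
            if not busy:
                self._worker = threading.Thread(
                    target=self._compute, args=(obs,), daemon=True
                )
                self._worker.start()
            return self._result      # never waits for the computation

    def wait(self):
        # Helper for demos and tests: block until the current worker ends.
        worker = self._worker
        if worker is not None:
            worker.join()
```

With agent = BackgroundResult(act, initial=-1), each agent.get_result(obs) call returns immediately with the most recent finished value, and a new computation is started only when the previous one has ended.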
I want to move some functions to an external file to make the code clearer.
Let's say I have this example code (which does indeed work):
    import threading
    from time import sleep

    testVal = 0

    def testFunc():
        while True:
            global testVal
            sleep(1)
            testVal = testVal + 1
            print(testVal)

    t = threading.Thread(target=testFunc, args=())
    t.daemon = True
    t.start()
    try:
        while True:
            sleep(2)
            print('testval = ' + str(testVal))
    except KeyboardInterrupt:
        pass
Now I want to move testFunc() to a new Python file. My guess was the following, but the global variables don't seem to be the same.
testserver.py:
    import threading
    import testclient
    from time import sleep

    testVal = 0

    t = threading.Thread(target=testclient.testFunc, args=())
    t.daemon = True
    t.start()
    try:
        while True:
            sleep(2)
            print('testval = ' + str(testVal))
    except KeyboardInterrupt:
        pass
and testclient.py:
    from time import sleep
    from testserver import testVal as val

    def testFunc():
        while True:
            global val
            sleep(1)
            val = val + 1
            print(val)
my output is:
1
testval = 0
2
3
testval = 0 (testval didn't change)
...
while it should be:
1
testval = 1
2
3
testval = 3
...
any suggestions? Thanks!
Your immediate problem is not due to multithreading (we'll get to that) but to how you use global variables. The thing is, when you use this:
    from testserver import testVal as val
you're essentially doing this:
    import testserver
    val = testserver.testVal
i.e. you're creating a name val in testclient.py that refers to the current value of testserver.testVal. This is all fine and dandy when you read it (the first time, at least), but when you try to assign to it in your function with:
    val = val + 1
you're actually rebinding the val variable that belongs to testclient.py, not setting the value of testserver.testVal. You have to refer to the original variable directly (i.e. testserver.testVal += 1) if you want to change its value.
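The binding behaviour described above can be checked in a single file. In this sketch the module is built in memory with types.ModuleType as a stand-in for testserver.py, purely so that the example is self-contained; the names after_local and after_module are made up.

```python
import types

# Stand-in for testserver.py, built in memory so the sketch is one file.
testserver = types.ModuleType("testserver")
testserver.testVal = 0

# Equivalent of: from testserver import testVal as val
val = testserver.testVal
val = val + 1                      # rebinds the local name only
after_local = testserver.testVal   # the module attribute is unchanged

testserver.testVal += 1            # updates the attribute on the module
after_module = testserver.testVal

print(val, after_local, after_module)  # 1 0 1
```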
That being said, the next problem you might encounter stems directly from multithreading: you can hit a race condition where the GIL pauses one thread right after it reads the value but before it writes it back, a second thread reads and overwrites the current value, and then the first thread resumes and writes the same value, resulting in a single increase despite two calls. If you want to use your data this way, you need some sort of mutex to make sure that every non-atomic operation is executed by only one thread at a time. The easiest way to do that is with a Lock from the threading module:
testserver.py:
    # ...
    testVal = 0
    testValLock = threading.Lock()
    # ...
testclient.py:
    # ...
    with testserver.testValLock:
        testserver.testVal += 1
    # ...
A third and final problem you might encounter is a circular dependency (testserver.py requires testclient.py, which requires testserver.py), and I'd advise you to rethink the way you approach this problem. If all you want is a common global store, create it separately from the modules that depend on it. That way you ensure a proper loading and initialization order without the danger of unresolvable circular dependencies.
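As a rough sketch of such a common store, the value and its lock can be wrapped in one small object. The class name SharedCounter and the helper hammer are hypothetical; in a real project the class would live in its own module that both testserver.py and testclient.py import, which is an assumption about the layout rather than something the question specifies.

```python
import threading


class SharedCounter:
    """Hypothetical shared store: a value together with the lock guarding it."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:           # read-modify-write happens as one step
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, times):
    # Worker used only for this demonstration.
    for _ in range(times):
        counter.increment()


counter = SharedCounter()
workers = [threading.Thread(target=hammer, args=(counter, 10000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(counter.value)  # 40000
```

Because every increment happens under the lock, no update is lost even with several threads writing at once.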
The story begins with two threads and a global variable that changes... many times :)
Thread number one (t1 for short) generates a random number and stores it in a global variable GLB.
Thread number two (aka t2) checks the value of the global variable and, when it reaches a certain value, prints that value repeatedly for a period of time.
BUT if t1 changes the value of the global variable, the value inside the loop changes as well, and I don't want that!
Here is my attempt, as rough code:
    import random
    import time
    import threading

    GLB = [0,0]

    #this is a thread
    def t1():
        while True:
            GLB[0] = random.randint(0, 100)
            GLB[1] = 1
            print(GLB)
            time.sleep(5)

    #this is a thread
    def t2():
        while True:
            if GLB[0]<=30:
                static = GLB
                for i in range(50):
                    print(i, " ", static)
                    time.sleep(1)

    a = threading.Thread(target=t1)
    a.start()
    b = threading.Thread(target=t2)
    b.start()
    while True:
        time.sleep(1)
The question is: why does the variable static change inside the for loop? It should remain constant until the loop exits!
Could I put a lock on the variable? Or is there another way to solve the problem?
Thanks and regards.
GLB is a mutable object, and static = GLB only creates a second name for the same list. To let one thread see a consistent value while another thread modifies it, you can either protect the object temporarily with a lock (the modifier will wait) or copy the object. In your example, a copy seems the best option. In Python (CPython with the GIL), a slice copy of a list is atomic, so it does not need any other locking.
    import random
    import time
    import threading

    GLB = [0,0]

    #this is a thread
    def t1():
        while True:
            GLB[0] = random.randint(0, 100)
            GLB[1] = 1
            print(GLB)
            time.sleep(5)

    #this is a thread
    def t2():
        while True:
            static = GLB[:]  # copy, so later changes to GLB are not seen
            if static[0]<=30:
                for i in range(50):
                    print(i, " ", static)
                    time.sleep(1)

    a = threading.Thread(target=t1)
    a.start()
    b = threading.Thread(target=t2)
    b.start()
    while True:
        time.sleep(1)
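The difference between static = GLB and static = GLB[:] can be checked without any threads. The names alias and snapshot below are made up for this illustration, and the starting values are arbitrary.

```python
GLB = [10, 1]

alias = GLB        # second name for the same list object
snapshot = GLB[:]  # new list holding the values as of now

GLB[0] = 99        # what t1 does on its next iteration

print(alias)                          # [99, 1]
print(snapshot)                       # [10, 1]
print(alias is GLB, snapshot is GLB)  # True False
```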
I'm having trouble accessing a global dictionary that holds objects as values.
Within the code, one thread (TestAccess) listens for web connections, creates objects and updates their variables, assigns a key to each one and inserts it into the global dictionary (client_list). The other thread (data_cleaner) goes through the keys of this global dictionary, checks certain values in each object, and deletes the objects that meet certain criteria.
The objects I'm creating (clientObject) attach another object (deviceObject) when they are created, just so you know.
When I run both threads, the thread that should check the objects (data_cleaner) does not see the dictionary being updated. It always prints {}. If I run the functions without any threads, both return the correct dictionary values as expected.
I have tried the global keyword but had no luck. I also added a Lock() just to make sure there are no simultaneous resource access issues.
Can someone please shed some light on this? The following is the structure of my code.
    import web
    import json
    import threading
    import time

    urls = (
        '/testaccess', "TestAccess"
    )
    client_list = {}
    lock = threading.Lock()

    class clientObject(object):
        # each created object here will attach another object from deviceObject below
        pass

    class deviceObject(object):
        # Object items
        pass

    class TestAccess:
        def __init__(self):
            pass

        def GET(self):
            return "abcd"

        def POST(self):
            raw_data = web.data()
            json_dic = json.loads(raw_data)
            process_data(json_dic)

    def process_data(json_dic):
        global lock
        global client_list
        lock.acquire()
        # Perform some processing on the JSON data.
        if XXXXXXXXXXXX:
            # Create the new object and update values.
            client_list[ID] = clientObject()
            client_list[ID].XX[ID].pred_vals(jsonInfo)
        else:
            # Update the object
            pass
        print(client_list)  # This prints all key:value pairs nicely as expected.
        lock.release()

    def data_cleaner():
        global lock
        global client_list
        while True:
            lock.acquire()
            print(client_list)  # this prints out just "{}"
            # Do other things
            lock.release()
            time.sleep(5)

    if __name__ == "__main__":
        app = web.application(urls, globals())

        def start_web_server():
            app.run()

        T2 = threading.Thread(target=data_cleaner)
        T1 = threading.Thread(target=start_web_server)
        T1.daemon = False
        T1.start()
        T2.start()
With MartijnPieters' help I was able to resolve this issue by passing autoreload=False as a parameter when creating the web application object, as shown below.
    if __name__ == "__main__":
        app = web.application(urls, globals(), autoreload=False)

        def start_web_server():
            app.run()

        T2 = threading.Thread(target=data_cleaner)
        T1 = threading.Thread(target=start_web_server)
        T1.daemon = False
        T1.start()
        T2.start()
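Separately from the autoreload fix, the writer/cleaner pattern described in the question can be sketched with the standard library alone. This does not use web.py; register, clean_once and the last_seen field are hypothetical names chosen for illustration. It uses the with statement so the lock is released even if the body raises, and it collects the stale keys before deleting them so the dictionary is not modified while it is being iterated.

```python
import threading

client_list = {}
lock = threading.Lock()


def register(client_id, last_seen):
    # Writer side, for example called from a request handler.
    with lock:
        client_list[client_id] = {"last_seen": last_seen}


def clean_once(now, max_age):
    # Cleaner side: find stale keys first, then delete them.
    with lock:
        stale = [cid for cid, info in client_list.items()
                 if now - info["last_seen"] > max_age]
        for cid in stale:
            del client_list[cid]
    return stale


writer = threading.Thread(target=register, args=("old-client", 0.0))
writer.start()
writer.join()
register("new-client", 100.0)

removed = clean_once(now=105.0, max_age=10.0)
print(removed, sorted(client_list))  # ['old-client'] ['new-client']
```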
I have the following code in a class:
    def __setattr__(self, key, value):
        self.__dict__['d'][key] = value
        ...
        self.saveToIni()
The saveToIni function saves all the dict's items to an ini file on every __setattr__ call on the object. If 80 setattr calls are made within 120 ms, the file is rewritten from scratch every time. The function also orders and sometimes deletes data from the dictionary, so I don't want to change it.
I want to limit the calls to once in, let's say, 5 seconds:
When the first setattr is triggered, a timer starts asynchronously, without running saveToIni yet.
If any calls are made to setattr while the timer is still counting, they neither start a timer nor run saveToIni.
When the timer times out, saveToIni should run.
Now, I'm not sure how to achieve this behavior. I've thought about messing with threads, but I still haven't figured out how to do it.
The way I would go is to create a Timer thread that runs a timedSaveToIni function once every 5 seconds.
I would also have a flag, isSaveRequested, telling this function whether it should actually write data to disk.
The __setattr__ function would simply set this flag to True.
Your code would look like this:
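    import threading

    class Dict:
        def __init__(self):
            ...
            # written via __dict__ so that the custom __setattr__ is not triggered
            self.__dict__['isSaveRequested'] = False
            self.timedSaveToIni()

        def timedSaveToIni(self):
            # re-arm the timer, then save only if something changed
            threading.Timer(5.0, self.timedSaveToIni).start()
            if self.isSaveRequested:
                self.saveToIni()
                self.__dict__['isSaveRequested'] = False

        def __setattr__(self, key, value):
            self.__dict__['d'][key] = value
            ...
            self.__dict__['isSaveRequested'] = True

        def saveToIni(self):
            ...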
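For comparison, here is a sketch of the behaviour the question describes literally: the first request starts a one-shot timer, requests made while it is pending are ignored, and the save runs when the timer expires. DebouncedSaver, request_save, flush and the Settings class are hypothetical names, and saveToIni here only counts calls instead of writing a file.

```python
import threading


class DebouncedSaver:
    """Runs save_func once, `delay` seconds after the first pending request."""

    def __init__(self, save_func, delay=5.0):
        self._save_func = save_func
        self._delay = delay
        self._lock = threading.Lock()
        self._pending = False        # True while a timer is counting down
        self._timer = None           # most recently created timer

    def request_save(self):
        with self._lock:
            if self._pending:
                return               # a save is already scheduled
            self._pending = True
            self._timer = threading.Timer(self._delay, self._fire)
            self._timer.daemon = True
            self._timer.start()

    def _fire(self):
        with self._lock:
            self._pending = False    # later requests start a new timer
        self._save_func()

    def flush(self):
        # Helper for shutdown and tests: wait for the last timer to finish.
        with self._lock:
            timer = self._timer
        if timer is not None:
            timer.join()


class Settings:
    def __init__(self, delay):
        # Stored via __dict__ so the custom __setattr__ is not triggered.
        self.__dict__['d'] = {}
        self.__dict__['saves'] = 0
        self.__dict__['saver'] = DebouncedSaver(self.saveToIni, delay)

    def __setattr__(self, key, value):
        self.__dict__['d'][key] = value
        self.__dict__['saver'].request_save()

    def saveToIni(self):
        # Stand-in for the real writer; it only counts how often it ran.
        self.__dict__['saves'] += 1
```

Note that the save runs in the timer's thread while attributes may still be set from the main thread, so a real saveToIni that reorders or deletes entries would probably need a lock around the dictionary as well.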
Perhaps you could just have the function remember when it was last called and block (sleep) until 5 seconds have passed since that call. That won't really require any sort of threading. Here's an example; it's just an idea, not specific to your use case.
    import time

    def timed_func():
        # block until at least 5 seconds have passed since the last call
        while (timed_func.last_use + 5) > time.time():
            time.sleep(1)
        print("I am working...")  # Do the job
        timed_func.last_use = time.time()

    def main():
        timed_func.last_use = time.time() - 5
        while True:
            timed_func()

    if __name__ == '__main__':
        main()