I am solving this question from LeetCode: 1116. Print Zero Even Odd
I am running this solution in VS Code with my own main function to understand the issue in depth.
After reading the question and the suggested solutions, in addition to reading this explanation, I added this code to the code from the solution:
from threading import Semaphore
import threading

def threaded(fn):
    def wrapper(*args, **kwargs):
        threading.Thread(target=fn, args=args, kwargs=kwargs).start()
    return wrapper
and before each of those functions from the question I added the @threaded decorator.
I added a printNumber function and a main function to run it in VS Code.
def printNumber(num):
    print(num, end="")
if __name__ == "__main__":
    a = ZeroEvenOdd(7)
    handle = a.zero(printNumber)
    handle = a.even(printNumber)
    handle = a.odd(printNumber)
Running this code gives me the correct answer, but I do not get a newline printed in the terminal afterwards. That is, for input 7 in my main function the output is 01020304050607hostname and not what I want it to be:
01020304050607
hostname
So I added print("\n") in main, and then I saw random output like:
0102
0304050607
or
0
1020304050607
still without a new line in the end.
When I try to use the join function, handle.join(), I get this error:
Exception has occurred: AttributeError 'NoneType' object has no
attribute 'join'
I tried to do this:
handle1 = a.zero(printNumber)
handle2 = a.even(printNumber)
handle3 = a.odd(printNumber)
handle1.join()
handle2.join()
handle3.join()
Still got the same error.
Where in the code should I wait for the threads to terminate?
Thanks.
When I try to use...handle.join()...I get the error: "...'NoneType' object has no attribute 'join'"
The error message means that the value of handle was None at the point in your program where your code tried to call handle.join(). There is no join() operation available on the None value.
You probably wanted to join() a thread (i.e., the object returned by threading.Thread(...)). For a single thread, you could do this:
t = threading.Thread(...)
t.start()
...
t.join()
Your program creates three threads, so you won't be able to just use a single variable t. You could use three separate variables, or you could create a list, or... I'll leave that up to you.
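For example, here is a minimal sketch of one option, keeping the threaded decorator from the question: have wrapper return the Thread it starts, so every decorated call hands back a handle you can join().

import threading

def threaded(fn):
    def wrapper(*args, **kwargs):
        t = threading.Thread(target=fn, args=args, kwargs=kwargs)
        t.start()
        return t  # return the thread so the caller can join() it
    return wrapper

Then in main, collect all three handles and wait for them before printing the final newline:

handles = [a.zero(printNumber), a.even(printNumber), a.odd(printNumber)]
for t in handles:
    t.join()
print()  # newline only after all digits have been printed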
Related
I keep having an issue when executing a function multiple times at once using the multiprocessing.Pool class.
I am using Python 3.8.3 on Windows 10 with PyCharm 2017.3.
The function I am executing opens Excel files from my hard disk and serialises them into custom objects, which I want to iterate through later on.
The error always occurs after the last execution of the function.
Here is what it says:
multiprocessing.pool.MaybeEncodingError: Error sending result: '[<IntegListe.IntegrityList object at 0x037481F0>, <IntegListe.IntegrityList object at 0x03D86CE8>, <IntegListe.IntegrityList object at 0x03F50F88>]'. Reason: 'TypeError("cannot pickle '_thread.RLock' object")'
Here is what my code looks like:
from multiprocessing import Pool
p = Pool()
ilList = p.starmap(extract_excel, [(f, spaltennamen) for f in files])
p.join()
p.close()
And this is the function I am trying to execute in parallel:
def extract_excel(t: tuple) -> IntegrityList:
    file_path = t[0]
    spaltennamen = t[1]
    il = IntegrityList(file_path)
    print(il)
    spaltennamen = list(map(lambda x: Excel.HereOrFind(il.ws, x, x.value), spaltennamen))  # Update position of column headers
    il.parse_columns(spaltennamen, il.ws)
    il.close()
    return il
Since I am quite new to Python, I am having trouble figuring out the magic behind this multiprocessing error. Executing the function serially works perfectly fine and I get the desired output. This proves that the function and all the sub-functions work as expected. I would be glad for any information that could help solve this problem. Thanks!
Okay, so for future viewers of this issue: I solved the error with the help of this website: https://www.synopsys.com/blogs/software-security/python-pickling/
It states that every custom object that is passed through a parallel process needs to implement the __reduce__ method so that it can be reconstructed when unpickling.
I simply added this code to my custom object:
def __reduce__(self):
    return IntegrityList, (self.file_path,)
After that the execution of the parallel processing works great.
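To see the idea end to end, here is a minimal, self-contained sketch; the class body and file names are invented for illustration, and only file_path and __reduce__ mirror the real code:

from multiprocessing import Pool

class IntegrityList:
    def __init__(self, file_path):
        self.file_path = file_path
        # Imagine unpicklable resources created here (open workbook, locks, ...).

    def __reduce__(self):
        # Rebuild the object from its constructor argument instead of
        # trying to pickle the live resources.
        return IntegrityList, (self.file_path,)

def extract_excel(file_path):
    return IntegrityList(file_path)

if __name__ == "__main__":
    # Pool as a context manager also takes care of calling close() before join().
    with Pool() as p:
        results = p.map(extract_excel, ["a.xlsx", "b.xlsx"])
    print([il.file_path for il in results])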
I have the following function (shortened for readability), which I parallelize using Python's (3.5) multiprocessing module:
def evaluate_prediction(enumeration_tuple):
    i = enumeration_tuple[0]
    logits_pred = enumeration_tuple[1]
    print("This prints succesfully")
    print("This never gets printed: ")
    print(enumeration_tuple[0])
    filename = sample_names_test[i]
    onehots_pred = logits_to_onehots(logits_pred)
    np.save("/media/nfs/7_raid/ebos/models/fcn/" + channels + "/test/ndarrays/" + filename, onehots_pred)
However, this function hangs whenever I attempt to read its input argument. Execution gets past the logits_pred = enumeration_tuple[1] line, as evidenced by a print statement printing a simple string, but it halts whenever I print(logits_pred). So apparently, whenever I actually need the passed value, the process stops. I do not get an exception or error message. When using either Python's built-in map() function or a for loop, the function finishes successfully. I have sufficient memory and computing power available. All processes are writing to different files. enumerate(predictions) yields correct index-value pairs, as expected. I call this function using Pool.map():
pool = multiprocessing.Pool()
file_results = pool.map(evaluate_prediction, enumerate(predictions))
Why is it hanging? And how can I get an exception, so I know what's going wrong?
UPDATE: After outsourcing the mapped function to another module, importing it from there, and adding __init__.py to my directory, I manage to print the first item in the tuple, but not the second.
I had a similar issue before, and a solution that worked for me was to put the function you want to parallelize in a separate module and then import it.
from eval_prediction import evaluate_prediction
pool = multiprocessing.Pool()
file_results = pool.map(evaluate_prediction, enumerate(predictions))
I assume you will save the function definition in a file named eval_prediction.py in the same directory. Make sure you have an __init__.py there as well.
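As a minimal sketch of that layout (the file names and the toy workload are placeholders, not the asker's real code):

# eval_prediction.py -- the worker function lives in its own importable module
def evaluate_prediction(enumeration_tuple):
    i, value = enumeration_tuple
    return i, value * 2  # stand-in for the real per-item work

# run_eval.py -- imports the worker and maps over the inputs
import multiprocessing
from eval_prediction import evaluate_prediction

if __name__ == "__main__":
    predictions = [10, 20, 30]
    with multiprocessing.Pool() as pool:
        file_results = pool.map(evaluate_prediction, enumerate(predictions))
    print(file_results)  # [(0, 20), (1, 40), (2, 60)]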
When I run the following code (using sudo python servers.py), the process seems to finish immediately after just printing "test".
Why don't the proxy_server functions run? Or maybe they do and I just don't realize it, because the first line in the proxy function doesn't print anything.
This is a stripped-down version of the code; I didn't want to include unnecessary content, yet it still demonstrates my problem:
import os, sys, thread, socket, select, struct, time

HTTP_PORT = 80
FTP_PORT = 21
FTP_DATA_PORT = 20
IP_IN = '10.0.1.3'
IP_OUT = '10.0.3.3'
sys_http = 'http_proxy'
sys_ftp = 'ftp_proxy'
sys_ftp_data = 'ftp_data_proxy'

def main():
    try:
        thread.start_new_thread(proxy_server, (HTTP_PORT, IP_IN, sys_http, http_handler))
        thread.start_new_thread(proxy_server, (FTP_PORT, IP_IN, sys_ftp, http_handler))
        thread.start_new_thread(proxy_server, (FTP_DATA_PORT, IP_OUT, sys_ftp_data, http_handler))
        print "test"
    except e:
        print 'Error!'
        sys.exit(1)

def proxy_server(host, port, fileName, handler):
    print "Proxy Server Running on ", host, ":", port

def http_handler(src, sock):
    return ''

if __name__ == '__main__':
    main()
What am I missing or doing wrong?
First, you have indentation problems related to using mixed tabs and spaces for indentation. While they didn't cause your code to misbehave in this particular case, they will cause you problems later if you don't stick to consistently using one or the other. They've already broken the displayed indentation in your question; see the print "test" line in main, which looks misaligned.
Second, instead of the low-level thread module, you should be using threading. Your problem is occurring because, as documented in the thread module documentation,
When the main thread exits, it is system defined whether the other threads survive. On SGI IRIX using the native thread implementation, they survive. On most other systems, they are killed without executing try ... finally clauses or executing object destructors.
threading threads let you explicitly define whether other threads should survive the death of the main thread, and default to surviving. In general, threading is much easier to use correctly.
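As a rough sketch (reusing the names from the question, and keeping the Python 2 syntax of the original), main() could look like this with threading:

import threading

def main():
    threads = []
    for args in [(HTTP_PORT, IP_IN, sys_http, http_handler),
                 (FTP_PORT, IP_IN, sys_ftp, http_handler),
                 (FTP_DATA_PORT, IP_OUT, sys_ftp_data, http_handler)]:
        t = threading.Thread(target=proxy_server, args=args)
        t.start()
        threads.append(t)
    print "test"
    # Non-daemon threading.Thread objects keep the process alive on their own,
    # but joining makes the wait explicit.
    for t in threads:
        t.join()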
I've written a couple of twitter scrapers in python, and am writing another script to keep them running even if they suffer a timeout, disconnection, etc.
My current solution is as follows:
Each scraper file has a doScrape(logger) function in it, which will start up a scraper and run it once, e.g.:
def doScrape(logger):
    try:
        with DBWriter(logger=logger) as db:
            logger.log_info("starting", __name__)
            s = PastScraper(db.getKeywords(), TwitterAuth(), db, logger)
            s.run()
    finally:
        logger.log_info("Done", __name__)
Where run is a near-infinite loop, which won't break unless there is an exception.
In order to run one of each kind of scraper at once, I'm using this code (with a few extra imports):
from threading import Thread

class ScraperThread(Thread):
    def __init__(self, module, logger):
        super(ScraperThread, self).__init__()
        self.module = module  # Module should contain a doScrape(logger) function
        self.logger = logger

    def run(self):
        while True:
            try:
                print "Starting!"
                print self.module.doScrape
                self.module.doScrape(self.logger)
            except:  # if for any reason we get disconnected, reconnect
                self.logger.log_debug("Restarting scraper", __name__)

if __name__ == "__main__":
    with Logger(level="all", handle=open(sys.argv[1], "a")) as l:
        past = ScraperThread(PastScraper, l)
        stream = ScraperThread(StreamScraper, l)
        past.start()
        stream.start()
        past.join()
        stream.join()
However, it appears that my call to doScrape above returns immediately: "Starting!" is printed in the console repeatedly, and the "Done" message in the finally block is never written to the log. Whereas when run individually like so:
if __name__ == "__main__":
    # Example instantiation
    from Scrapers.Logging import Logger
    with Logger(level="all", handle=open(sys.argv[1], "a")) as l:
        doScrape(l)
The code runs forever, as expected. I'm a bit stumped.
Is there anything silly that I might have missed?
Get rid of the "diaper pattern" in your run() method; that is, get rid of that catch-all exception handler. You'll probably see the actual error printed then. I think there may be something wrong in the DBWriter or other code you're calling from your doScrape function. Perhaps it is not thread-safe. That would explain why running it from the main program directly works, but calling it from a thread fails.
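For instance, run() could catch a narrower Exception and log the full traceback before restarting; a sketch reusing the class from the question:

import traceback
from threading import Thread

class ScraperThread(Thread):
    def __init__(self, module, logger):
        super(ScraperThread, self).__init__()
        self.module = module
        self.logger = logger

    def run(self):
        while True:
            try:
                self.module.doScrape(self.logger)
            except Exception:
                # Log the full traceback instead of silently swallowing it,
                # then restart the scraper.
                self.logger.log_debug(traceback.format_exc(), __name__)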
Aha, solved it! It was actually that I didn't realise that a default argument (here in TwitterAuth()) is evaluated at definition time. TwitterAuth reads the API key settings from a file handle, and the default argument opens the default config file. Since this file handle is created at definition time, both threads got the same handle, and once one had read it, the other tried to read from the end of the file, throwing an exception. This is remedied by resetting the file before use and using a mutex.
Cheers to Irmen de Jong for pointing me in the right direction.
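The pitfall in miniature (the file name and functions here are invented for illustration):

# The default argument is evaluated once, at definition time, so every
# call shares the same file handle.
def read_config(handle=open("config.txt")):
    return handle.read()

print(read_config())  # first call reads the whole file
print(read_config())  # second call starts at end-of-file and gets ""

# Safer: open the file inside the function, once per call.
def read_config_safe(path="config.txt"):
    with open(path) as handle:
        return handle.read()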
I currently have a process running that should call a method every 10 seconds. I see that it actually calls the method at that interval, but it seems not to execute something in the code. The weird thing is that when I cancel the loop and start it again, it does work the first time; when I then keep it running, it does nothing.
def main():
    try:
        while True:
            read()
            time.sleep(10)
    except KeyboardInterrupt:
        pass
Above is the loop. The code below is the beginning of the method being called, and I found out that it does not actually get any results into results, even though the file has changed. In this case it gets its data from a .json file:
def read():
    message = Query()
    results = DB.search(message.pushed == False)
Am I overlooking something?
Solved. I had the DB declared globally, and that did not go well. It is fixed by declaring it just before the search statement.
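The Query()/search() calls look like TinyDB's API; assuming that, the fix would look roughly like this (the file name is invented):

from tinydb import TinyDB, Query

def read():
    # Open the database inside the function so each call sees the
    # current contents of the .json file on disk.
    db = TinyDB("data.json")
    message = Query()
    results = db.search(message.pushed == False)
    return results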