I have a simple Twisted web server application serving my math requests. Everything works fine (I've hidden the big code pieces that are not related to my question):
#import section ...
class PlsPage(Resource):
    isLeaf = True

    def render_POST(self, request):
        reactor.callLater(0, self._delayedRender, request)
        return NOT_DONE_YET

    def _delayedRender(self, request):
        #some actions before
        crossval_scores = cross_validation.cross_val_score(pls1, X, y=numpy.asarray(Y), scoring=my_custom_scorer, cv=KFold(700, n_folds=700))
        #some actions after
        request.finish()

reactor.listenTCP(12000, server.Site(PlsPage()))
reactor.run()
When I try to speed up the cross_validation calculation by setting n_jobs, for example to 3:
crossval_scores = cross_validation.cross_val_score(pls1, X, y=numpy.asarray(Y), scoring=my_custom_scorer, cv=KFold(700, n_folds=700), n_jobs=3)
I get exactly 3 exceptions:
twisted.internet.error.CannotListenError: Couldn't listen on any:12000: [Errno 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.
For some reason I can't call cross_val_score with n_jobs > 1 inside _delayedRender.
Here is the traceback of the exception; for some reason reactor.listenTCP is trying to start 3 times too.
Any ideas how to get this to work?
UPD1. I created a file PLS.py and moved all the code there, except the last 2 lines:
from twisted.web import server
from twisted.internet import reactor, threads
import PLS
reactor.listenTCP(12000, server.Site(PLS.PlsPage()))
reactor.run()
But the problem still persists. I also found that this problem occurs only on Windows; my Linux machine runs these scripts fine.
scikit-learn apparently uses the multiprocessing module in order to achieve concurrency. The multiprocessing module transmits data between processes using pickle, which, among other... idiosyncratic problems that it causes, will cause some of the modules imported in your parent process to be imported in your worker processes.
Your PLS_web.py "module", however, is not actually a module, it's a script; since you have put reactor.listenTCP and reactor.run at the bottom of it, it actually does stuff when you import it rather than just loading its code.
This particular error occurs because your web server is being run 4 times (once for the controller process and once for each of the three jobs); each of the 3 runs beyond the first encounters an error because the first server is already listening on port 12000.
You should move the reactor.run/reactor.listenTCP lines elsewhere, into a top-level script. A good rule of thumb is that these lines should never appear in the same file as a class or def statement; define your code in one place and start it up in another. Once you've moved them to a file that doesn't get imported (and you might even want to put them in a file whose name isn't a legal module identifier, like run-my-server.py), then multiprocessing might be able to import all the code it needs and do its job.
Better yet, don't write those lines at all; write a Twisted application plugin and run your program with twistd. If you don't have to put the reactor.run statement anywhere, you can't put it in the wrong place :).
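For example, the simplest twistd-friendly form is a .tac file rather than a full plugin. A minimal sketch, assuming PLS.py exposes PlsPage as in the question (the file name plsserver.tac is illustrative):

# plsserver.tac -- run with: twistd -ny plsserver.tac
from twisted.application import internet, service
from twisted.web import server

import PLS

# twistd looks for a module-level variable named "application"
application = service.Application("pls-server")
site_service = internet.TCPServer(12000, server.Site(PLS.PlsPage()))
site_service.setServiceParent(application)

This way twistd owns the reactor lifecycle, so nothing starts listening as a side effect of an import.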
Related
I'm doing some unit testing for a Flask application. Part of this includes restarting the Flask application for each test. To do this, I'm creating my Flask application in the setUp() function of my unittest.TestCase, so that I get the application in a fresh state for each run. I'm also starting the application in a separate thread so the tests can run without the Flask application blocking.
Example below:
import json
import requests
import unittest
from threading import Thread

# `app` is the Flask application under test (its import is omitted here)

class MyTest(unittest.TestCase):

    def setUp(self):
        test_port = 8000
        self.test_url = f"http://0.0.0.0:{test_port}"
        self.app_thread = Thread(target=app.run, kwargs={"host": "0.0.0.0", "port": test_port, "debug": False})
        self.app_thread.start()

    def test_a_test_that_contacts_the_server(self):
        response = requests.post(
            f"{self.test_url}/dosomething",
            json={"foo": "bar"},
            headers=foo_bar  # placeholder headers
        )
        is_successful = json.loads(response.text)["isSuccessful"]
        self.assertTrue(is_successful, msg=json.loads(response.text)["message"])

    def tearDown(self):
        # what should I do here???
        pass
This becomes problematic because the tests that come after the initial test run into an issue with port 8000 already being in use. This raises OSError: [Errno 98] Address already in use.
(For now, I've built a workaround where I generate a list of high-range ports, and another list of ports used per test, so that I never select a port used by a previous test. This workaround works, but I'd really like to know the proper way to shut down this Flask application, ultimately closing the connection and releasing/freeing that port.)
I'm hopeful that there is a specific way to shutdown this flask application in the tearDown() function.
How should I go about shutting down the flask application in my tearDown() method?
I found the solution to my own question while writing it, and since it's encouraged to answer your own question on Stack Overflow, I'd still like to share this for anyone else with the same issue.
The solution to this problem is to treat the Flask application as another process instead of a thread. This is accomplished using Process from the multiprocessing module in lieu of Thread from the threading module.
I came to this conclusion after reading this Stack Overflow answer regarding stopping Flask without using CTRL + C. Reading that answer then led me to read about the differences between multiprocessing and threading in this Stack Overflow answer. Of course, after that, I moved on to the official documentation on the multiprocessing module, found here. More specifically, this link will take you straight to the Process class.
I'm not able to fully articulate why the multiprocessing module serves this purpose better than threading, but I do feel that it makes more sense for this application. After all, the flask application is acting as its own API server that is separate from my test, and my test is testing the calls to it/responses it gets back. For this reason, I think it makes the most sense for my flask application to be its own process.
tl;dr
Use multiprocessing.Process in lieu of threading.Thread, and then call Process.terminate() to kill the process, followed by Process.join() to block until the process has terminated.
example:
import json
import requests
import unittest
from multiprocessing import Process

# `app` is the Flask application under test (its import is omitted here)

class MyTest(unittest.TestCase):

    def setUp(self):
        test_port = 8000
        self.test_url = f"http://0.0.0.0:{test_port}"
        self.app_process = Process(target=app.run, kwargs={"host": "0.0.0.0", "port": test_port, "debug": False})
        self.app_process.start()

    def test_a_test_that_contacts_the_server(self):
        response = requests.post(
            f"{self.test_url}/dosomething",
            json={"foo": "bar"},
            headers=foo_bar  # placeholder headers
        )
        is_successful = json.loads(response.text)["isSuccessful"]
        self.assertTrue(is_successful, msg=json.loads(response.text)["message"])

    def tearDown(self):
        self.app_process.terminate()
        self.app_process.join()
Test early, and test often!
After reading A LOT on the subject I still couldn't find an actual solution to my problem (there might not be one).
My problem is as following:
In my project I have multiple drivers working with various hardware (IO managers, programmable loads, power supplies and more).
Initializing a connection to these devices is costly (in time), and I can't open and then close the connection for every communication iteration between us.
Meaning I can't do this (assuming programmable_load implements __enter__/__exit__):
# start of code...
with programmable_load(args) as program_instance:
    program_instance.do_something()
# rest of code...
So I went for a different solution:
class programmable_load():
    def __init__(self):
        self.handler = handler_creator()

    def close_connection(self):
        self.handler.close_connection()
        self.handler = None

    def __del__(self):
        if self.handler is not None:
            self.close_connection()
For obvious reasons I don't 'trust' the destructor to actually get called, so I explicitly call close_connection() when I want to end my program (for all drivers).
The problem happens when I abruptly terminate the process, for example when I run in debug mode and then quit debugging.
In these cases the process terminates without running any of the destructors.
I understand that the OS will reclaim all the memory at this point, but is there any way to clean things up in an organized manner?
And if not, is there a way to make the "quit debugging" action pass through a certain set of functions? Does the Python process know it got a quit-debugging event, or does it treat it as a normal termination?
Operating system: Windows
According to this documentation:
If a process is terminated by TerminateProcess, all threads of the process are terminated immediately with no chance to run additional code.
(Emphasis mine.) This implies that there is nothing you can do in this case.
As detailed here, signals don't work very well on ms-windows.
As was mentioned in a comment, you could use atexit to do the cleanup. But that only works if the process is asked to close (e.g. a QUIT signal on Linux) and not just killed (as is likely the case when stopping a debugging session). Similarly, if you force your computer to turn off (e.g. long-press the power button or remove power), it won't be called either. There is no 'solution' to that for obvious reasons. Your program can't expect to be called when the power suddenly goes off or when it is forcefully killed. The point of forcefully killing is to definitely kill the process now; if it first called your clean-up code, then you could delay that, which defeats the purpose. That is why there are signals to ask your process to stop. This is not Python specific; the same concept applies across operating systems.
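For completeness, a minimal atexit sketch (close_all_connections is a hypothetical cleanup function); it runs on a normal interpreter exit, but not when the process is forcefully killed:

import atexit

def close_all_connections():
    # hypothetical cleanup: close every open driver handle here
    print("Closing hardware connections")

atexit.register(close_all_connections)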
Bonus (design suggestion, not a solution): I would argue that you can still make use of the context manager (using with). Your problem is not unique; database connections are usually kept alive for longer as well. It is a question of scope. Move the context further up, to the application level. Then it is clear what the boundary is and you don't need any magic (you are probably also aware of contextlib.contextmanager to make that a breeze).
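A rough sketch of that idea, using the asker's programmable_load class and a hypothetical run_application entry point:

from contextlib import contextmanager

@contextmanager
def open_load():
    load = programmable_load()
    try:
        yield load
    finally:
        load.close_connection()

def main():
    # the connection stays open for the whole application run
    with open_load() as load:
        run_application(load)  # hypothetical application entry point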
I haven't tested this properly as I don't have Wing IDE installed here, so I can't guarantee this will work, but what about using SetConsoleCtrlHandler? For instance, try something like this:
import os
import sys
import win32api

if __name__ == "__main__":
    def callback(sig, func=None):
        print("Exit handler called!")

    try:
        win32api.SetConsoleCtrlHandler(callback, True)
    except Exception as e:
        print("Captured exception", e)
        sys.exit(1)

    print("Press to quit")
    input()
    print("Bye!")
It'll be able to handle CTRL+C and CTRL+BREAK signals.
I have 2 Python scripts. The 1st is a Flask server and the 2nd is an NRF24L01 receiver/transmitter script (on a Raspberry Pi 3). Both scripts are running at the same time. I want to pass variables (the variables are not constant) between these 2 scripts. How can I do that in the simplest way?
How about a Python RPC setup? I.e., run a server in each script, and each script can also be a client to invoke Remote Procedure Calls on the other.
https://docs.python.org/2/library/simplexmlrpcserver.html#simplexmlrpcserver-example
I'd like to propose a complete solution based on Sush's proposition. For the last few days I've been struggling with the problem of communicating between two processes run separately (in my case, on the same machine). There are lots of solutions (sockets, RPC, simple RPC or other servers) but all of them had some limitations. What worked for me was the SimpleXMLRPCServer module: fast, reliable and better than direct socket operations in every aspect. A fully functioning server which can be cleanly closed from the client is as short as this:
from SimpleXMLRPCServer import SimpleXMLRPCServer

quit_please = 0

s = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)  # allow_none enables use of methods without a return value
s.register_introspection_functions()  # enables use of s.system.listMethods()
s.register_function(pow)  # example of a function natively supported by Python, forwarded as a server method

# Register a function under a different name
def example_method(x):
    # whatever needs to be done goes here
    return 'Entered value is ' + x
s.register_function(example_method, 'example')

def kill():
    global quit_please
    quit_please = 1
    #return True
s.register_function(kill)

while not quit_please:
    s.handle_request()
My main help was a 15-year-old article found here.
Also, a lot of tutorials use s.serve_forever(), which is a real pain to stop cleanly without multithreading.
To communicate with the server all you need to do is basically 2 lines:
import xmlrpclib
serv = xmlrpclib.ServerProxy('http://localhost:8000')
Example:
>>> import xmlrpclib
>>> serv = xmlrpclib.ServerProxy('http://localhost:8000')
>>> serv.example('Hello world')
'Entered value is Hello world'
And that's it! Fully functional, fast and reliable communication. I am aware that there are always some improvements to be done but for most cases this approach will work flawlessly.
I have an application that uses PyQt4 and python-twisted to maintain a connection to another program. I am using "qt4reactor.py" as found here. This is all packaged up using py2exe. The application works wonderfully for 99% of users, but one user has reported that networking is failing completely on his Windows system. No other users report the issue, and I cannot replicate it on my own Windows VM. The user reports no abnormal configuration.
The debugging logs show that the reactor.connectTCP() call is executing immediately, even though the reactor hasn't been started yet! There's no mistaking the run order, because this is a single-threaded process with 60 sec of computation and multiple log messages between this line and where the reactor is supposed to start.
There's a lot of code, so I am only putting in pseudo-code, hoping that there is a general solution for this issue. I will link to the actual code below it.
import qt4reactor
qt4reactor.install()
# Start setting up main window
# ...
from twisted.internet import reactor
# Separate listener for detecting/processing multiple instances
self.InstanceListener = ListenerFactory(...)
reactor.listenTCP(LISTEN_PORT, self.InstanceListener)
# The active/main connection
self.NetworkingFactory = ClientFactory(...)
reactor.connectTCP(ACTIVE_IP, ACTIVE_PORT, self.NetworkingFactory)
# Finish setting up main window
# ...
from twisted.internet import reactor
reactor.runReturn()
The code is nested throughout the Armory project files. ArmoryQt.py (containing the above code) and armoryengine.py (containing the ReconnectingClientFactory subclass used for this connection).
So, the reactor.connectTCP() call executes immediately. The client code executes the send command and then connectionLost() gets called immediately. It does not appear to try to reconnect, and it doesn't throw any errors other than connectionLost(). Even more mysteriously, it receives messages from the remote node later on, and this app even processes them! But it believes it's not connected (and the handshake never finished, so the remote node shouldn't be sending messages, but that might be a bug/oversight in that program).
What on earth is going on!? How could the reactor get started before I tell it to start? I searched the code and found no other code that (I believe) could start the reactor.
The API that you're looking for is twisted.internet.reactor.callWhenRunning.
However, it wouldn't hurt to have less than 60 seconds of computation at startup, either :). Perhaps you should spread that out, or delegate it to a thread, if it's relatively independent?
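For illustration, a minimal sketch of deferring the connection until the reactor is actually running; the factory and address below are placeholders, not the code from ArmoryQt.py:

from twisted.internet import protocol, reactor

class PlaceholderClientFactory(protocol.ClientFactory):
    protocol = protocol.Protocol

def start_networking():
    # runs only once the reactor has actually started
    reactor.connectTCP("127.0.0.1", 8333, PlaceholderClientFactory())

reactor.callWhenRunning(start_networking)
reactor.run()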
When I open Firefox, then run the command:
firefox http://somewebsite
the URL opens in a new tab of the already running Firefox (the same thing happens with Chromium as well). Is there some way to replicate this behavior in Python? For example, calling:
processStuff.py file/url
then calling:
processStuff.py anotherfile
should not start two different processes, but send a message to the currently running program. For example, you could have info in one tabbed dialog box instead of 10 separate windows.
Adding bounty for anyone who can describe how Firefox/Chromium do this in a cross-platform way.
The way Firefox does it is: the first instance creates a socket file (or a named pipe on Windows). This serves both as a way for later instances of Firefox to detect and communicate with the first instance, forwarding it the URL before dying. A socket file or named pipe is only accessible from processes running on the local system (as files are), so no network client can have access to it. And since they are files, firewalls will not block them either (it's like writing to a file).
Here is a naive implementation to illustrate my point. On first launch, the socket file lock.sock is created. Subsequent launches of the script will detect the lock and send the URL to it:
import socket
import os

SOCKET_FILENAME = 'lock.sock'

def server():
    print 'I\'m the server, creating the socket'
    s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    s.bind(SOCKET_FILENAME)
    try:
        while True:
            print 'Got a URL: %s' % s.recv(65536)
    except KeyboardInterrupt, exc:
        print 'Quitting, removing the socket file'
        s.close()
        os.remove(SOCKET_FILENAME)

def client():
    print 'I\'m the client, opening the socket'
    s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    s.connect(SOCKET_FILENAME)
    s.send('http://stackoverflow.com')
    s.close()

def main():
    if os.path.exists(SOCKET_FILENAME):
        try:
            client()
        except (socket.error):
            print "Bad socket file, program closed unexpectedly?"
            os.remove(SOCKET_FILENAME)
            server()
    else:
        server()

main()
You should implement a proper protocol (for instance, send proper datagrams instead of hardcoding the length), maybe using SocketServer, but that is beyond the scope of this question. The Python Socket Programming HOWTO might also help you. I have no Windows machine available, so I cannot confirm that this works on that platform.
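If you go the SocketServer route, a rough Python 2, Unix-only sketch of the server side could look like this (the handler name and socket path are illustrative, and each client is assumed to send one newline-terminated URL):

import SocketServer

class URLHandler(SocketServer.StreamRequestHandler):
    def handle(self):
        # each connection sends one URL terminated by a newline
        url = self.rfile.readline().strip()
        print 'Got a URL: %s' % url

server = SocketServer.UnixStreamServer('lock.sock', URLHandler)
server.serve_forever()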
You could create a data directory where you create a "locking file" once your program is running, after having checked whether the file already exists.
If it exists, you should try to communicate with the existing process, which creates a socket or a pipe or something like this and communicates its address or its path in an appropriate way.
There are many different ways to do so, depending on which platform the program runs.
While I doubt this is how Firefox / Chrome does it, it would be possible to achieve your goal without sockets, relying solely on the file system. I found it difficult to put into words, so see below for a rough flow chart of how it could be done. I would consider this approach similar to a cookie :). One last thought: with this it could also be possible to store workspaces or tabs across multiple sessions.
EDIT
Per a comment, environment variables are not shared between processes. All of my work thus far has been within a single process calling multiple modules. Sorry for any confusion.
I think you could use multiprocessing connections with a subprocess to accomplish this. Your script would just have to try to connect to the "remote" connection on localhost and if it's not available then it could start it.
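A rough Python 3 sketch of that idea using multiprocessing.connection (the port and authkey are arbitrary placeholders):

# receiver.py -- run inside the first script
from multiprocessing.connection import Listener

listener = Listener(("localhost", 6000), authkey=b"secret")
conn = listener.accept()
while True:
    msg = conn.recv()              # blocks until the other script sends something
    if msg == "close":
        break
    print("received:", msg)
conn.close()
listener.close()

# sender.py -- run inside the second script
from multiprocessing.connection import Client

conn = Client(("localhost", 6000), authkey=b"secret")
conn.send({"reading": 42})         # any picklable object works
conn.send("close")
conn.close()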
The most basic approach is to use sockets.
http://wiki.python.org/moin/ParallelProcessing
Use threading: http://www.valuedlessons.com/2008/06/message-passing-conccurrency-actor.html
Example for Socket Programming: http://code.activestate.com/recipes/52218-message-passing-with-socket-datagrams/