I've developed a set of audio streaming servers, all of them using Twisted and written in Python, of course. They work, but one problem keeps troubling me: when I find a bug in a running server, or want to add something to it, I have to stop it and start it again. HTTP servers can be restarted at any time, but audio streaming servers cannot: once I restart my streaming server, my users get disconnected.
I did try setting up a manhole (an SSH service for Twisted servers; you can log in and type Python code in the console to do something), connecting to the console, and reloading Python modules on the fly. It works sometimes, but it is hard to control. You never know how many instances of an old class are still in the server, some of them may be hard to reach, and the relationships between classes can be very complex. Also, although reloading works in some situations, sometimes you really need to restart the server. For example, if you are running the server with the select reactor and want to run it with the epoll reactor instead, you have to restart it. Likewise, when memory usage grows too high, you have to restart, too.
To build such a system, I have an idea: is it possible to hand over the connections and data from one process to another? For example:
We have a server named Broadcasting; the running instance is at rev.123, and we want to replace it with rev.124.
Broadcasting rev.123 is running....
Startup Broadcasting rev.124 ....
Broadcasting rev.124 is stand by
Hand over connections from instance of rev.123 to instance of rev.124
Stop Broadcasting rev. 123 instance
Is this possible? I don't know whether the lifetime of socket handles is bound to the process. I thought sockets created by a process are closed when the creating process is killed, but I'm not sure. If it is possible, are there any guidelines or articles on designing this kind of hot code swapping mechanism? And has something that does what I want already been done for Twisted?
Thanks.
I gave a talk about this at PyCon 2004. There's also some effort to add more functionality to help with this to Twisted itself.
Related
In an attempt to make my terminal-based program survive longer, I was told to look into forking the process off of system. I can't find much about specifying a PID from which to spawn a new process.
Is this possible in Linux? I am mainly a Windows guy.
My program is going to be dealing with sockets and if my application crashed then I would lose lots of information. I was under the impression that if it was forked from system the sockets would stay alive?
EDIT: Here is what I am trying to do. I have multiple computers that I want to communicate with. So I am building a program that lets me listen on a socket(simple). Then I will connect to it from each of my remote computers(simple).
Once I have a connection I want to open a new terminal, and use my program to start interacting with the remote computer(simple).
The questions came from this portion.. The client shell will send all traffic to the main shell who will then send it out to the remote computer. When a response is received it goes to main shell and forwards it to client shell.
The issue is keeping each client shell in the loop. I want all client shells to know who is connected to who on each client shell. So client shell 1 should tell me if I have a client shell 2, 3, 4, 5, etc and who is connected to it. This jumped into sharing resources between different processes. So I was thinking about using local sockets to send data between all these client shells. But then I ran into a problem if the main shell were to die, everything is lost. So I wanted a way to try and secure it.
If that makes sense.
So, you want to be able to reload a program without losing your open socket connections?
The first thing to understand is that when a process exits, all open file descriptors are closed. This includes socket connections. Running as a daemon does not change that. A process becomes a daemon by becoming independent of your terminal session, so that it will continue to run when your terminal session ends. But, like any other process, when a daemon terminates for any reason (normal exit, crashed, killed, machine is restarted, etc), then all connections to it cease to exist. BTW this is not specific to unix, Windows is the same.
So, the short answer to your question is NO, there's no way to tell unix/linux to not close your sockets when your process stops, it will close them and that's that.
The long answer is, there are a few ways to re-engineer things to get around this:
1) You can have your program exec() itself when you send it a special message or signal (eg SIGHUP). In unix, exec (or its several variants), does not end or start any process, it simply loads code into the current process and starts execution. The new code takes the place of the old within the same process. Since the process remains the same, any open files remain open. However you will lose any data that you had in memory, so the sockets will be open, but your program will know nothing about them. On startup you'd have to use various system calls to discover which descriptors are open in your process and whether any of them are socket connections to clients. One way to get around this would be to pass critical information as command line arguments or environment variables which can be passed through the exec() call and thus preserved for use of the new code when it starts executing.
Keep in mind that this only works when the process calls exec ITSELF while it is still running. So you cannot recover from a crash or any other cause of your process ending.. your connections will be gone. But this method does solve the problem of you wanting to load new code without losing your connections.
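Here is a rough sketch of approach 1 in Python, under some assumptions of mine: the environment variable name INHERITED_SOCKET_FDS and the helper names are made up for illustration, and the new code is assumed to be the same script started with the same arguments. Note that modern Python marks descriptors non-inheritable by default, so they have to be flagged explicitly before exec().

```python
import os
import socket
import sys

# Hypothetical environment variable used to hand descriptor numbers
# to the re-executed program.
FD_ENV_VAR = "INHERITED_SOCKET_FDS"


def encode_fds(socks):
    """Mark each socket inheritable and return its fd numbers as a string."""
    fds = []
    for sock in socks:
        os.set_inheritable(sock.fileno(), True)  # keep it open across exec()
        fds.append(str(sock.fileno()))
    return ",".join(fds)


def decode_fds(value):
    """Rebuild socket objects from a comma-separated list of fd numbers."""
    if not value:
        return []
    # With fileno= given, Python 3.7+ detects family/type from the descriptor.
    return [socket.socket(fileno=int(fd)) for fd in value.split(",")]


def reexec(socks):
    """Replace the running code with a fresh copy, keeping the sockets open."""
    env = dict(os.environ)
    env[FD_ENV_VAR] = encode_fds(socks)
    os.execve(sys.executable, [sys.executable] + sys.argv, env)  # never returns


def recover_sockets():
    """Call at startup: sockets inherited from a previous exec(), if any."""
    return decode_fds(os.environ.get(FD_ENV_VAR, ""))
```

You would call reexec() from your SIGHUP handling path and recover_sockets() early in startup. Anything else you need to know about each connection (which client it is, protocol state) still has to be passed along separately.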
2) You can bypass the issue by dividing your server (master) into two processes. The first (call it the "proxy") accepts the TCP connections from the clients and keeps them open. The proxy can never exit, so it should be kept so simple that you'll rarely want to change that code. The second process runs the "worker", which is the code that implements your application logic. All the code you might want to change often should go in the worker. Now all you need to do is establish interprocess communication from the proxy to the worker, and make sure that if the worker exits, there's enough information in the proxy to re-establish your application state when the worker starts up again. In a really simple, low volume application, the mechanism can be as simple as the proxy doing a fork() + exec() of the worker each time it needs to do something. A fancier way to do this, which I have used with good results, is a unix domain datagram (SOCK_DGRAM) socket. The proxy receives messages from the clients, forwards them to the worker through the datagram socket, the worker does the work, and responds with the result back to the proxy, which in turn forwards it back to the client. This works well because as long as the proxy is running and has opened the unix domain socket, the worker can restart at will. Shared memory can also work as a way to communicate between proxy and worker.
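A minimal sketch of the datagram link in approach 2, in Python. The function names, the socket paths and the upper-casing "work" are placeholders I made up; a real proxy would also track which client each message belongs to.

```python
import os
import socket


def make_endpoint(path):
    """Create a unix domain datagram socket bound to the given path."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket file left by a previous run
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(path)
    return sock


def proxy_forward(proxy_sock, worker_path, client_data):
    """Proxy side: hand data received from a client to the worker."""
    proxy_sock.sendto(client_data, worker_path)


def worker_step(worker_sock):
    """Worker side: read one request, do the work, reply to the sender."""
    request, sender = worker_sock.recvfrom(65536)
    result = request.upper()  # placeholder for the real application logic
    worker_sock.sendto(result, sender)
```

Because the proxy addresses the worker by path rather than by an open connection, a restarted worker only has to bind the same path again and the proxy carries on sending to it.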
3) You can use the unix domain socket along with the sendmsg() and recvmsg() functions and an SCM_RIGHTS control message to pass not the client data itself, but to actually send the open socket file descriptors from the old instance to the new. This is the only way to pass open file descriptors between unrelated processes. Using this mechanism, there are all sorts of strategies you can implement.. for example, you could start a new instance of your master program, and have it connect (via a unix domain socket) to the old instance and transfer all the sockets over. Then your old instance can exit. Or, you can use the proxy/worker model, but instead of passing messages through the proxy, you can just have the proxy hand the socket descriptor to the worker via the unix domain socket between them, and then the worker can talk directly to the client using that descriptor. Or, you could have your master send all its socket file descriptors to another "stash" process that holds on to them in case the master needs to restart. There are all sorts of architectures possible. Keep in mind that the operating system just provides the ability to ship the descriptors around; all the other logic you have to code for yourself.
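For approach 3, descriptor passing might look roughly like this in Python. It follows the pattern shown in the Python socket documentation; send_fd and recv_fd are names I chose, and the sketch passes one descriptor per message for simplicity.

```python
import array
import socket


def send_fd(channel, fd):
    """Send one open file descriptor over a connected unix domain socket."""
    fds = array.array("i", [fd])
    # Some ordinary data has to accompany the control message.
    channel.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    msg, ancdata, flags, addr = channel.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")
```

The receiving process gets a new descriptor number referring to the same open connection, and can wrap it with socket.socket(fileno=...). The sender can then close its own copy without disconnecting the client.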
4) You can accept that no matter how careful you are, inevitably connections will be lost. Networks are unreliable, programs crash sometimes, machines are restarted. So rather than going to significant effort to make sure your connections don't close, you can instead engineer your system to recover when they inevitably do.
The simplest approach to this would be: Since your clients know who they wish to connect to, you could have your client processes run a loop where, if the connection to the master is lost for any reason, they periodically try to reconnect (let's say every 10-30 seconds), until they succeed. So all the master has to do is to open up the rendezvous (listening) socket and wait, and the connections will be re-established from every client that is still out there running. The client then has to re-send any information it has which is necessary to re-establish proper state in the master.
The list of connected computers can be kept in the memory of the master, there is no reason to write it to disk or anywhere else, since when the master exits (for any reason), those connections don't exist anymore. Any client can then connect to your server (master) process and ask it for a list of clients that are connected.
Personally, I would take this last approach. Since it seems that in your system, the connections themselves are much more valuable than the state of the master, being able to recover them in the event of a loss would be the first priority.
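The client-side reconnect loop from approach 4 could be sketched like this. The function name, the default delay and the max_attempts parameter are my own choices for illustration; with max_attempts left as None it retries forever, as described above.

```python
import socket
import time


def connect_with_retry(host, port, delay=10.0, max_attempts=None):
    """Keep trying to reach the master until a connection succeeds."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return socket.create_connection((host, port), timeout=5.0)
        except OSError:
            # Master is down or unreachable; give up only if a limit was set.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            time.sleep(delay)
```

After this returns, the client would re-send whatever the master needs in order to rebuild its state (for example, the client's name).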
In any case, since it seems that the role of the master is to simply pass data back and forth among clients, this would be a good application of "asynchronous" socket I/O using the select() or poll() functions, this allows you to communicate between multiple sockets in one process without blocking. Here's a good example of a poll() based server that accepts multiple connections:
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/rzab6/poll.htm
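If you end up writing the master in Python rather than C, the same idea can be sketched with select.select(). This is only an illustration: relay_once is a name I made up, it relays every message to all other clients, and it leaves out error handling such as a connection reset during recv().

```python
import select
import socket


def relay_once(listener, clients, timeout=1.0):
    """One select() pass: accept a new client or relay incoming data."""
    readable, _, _ = select.select([listener] + clients, [], [], timeout)
    for sock in readable:
        if sock is listener:
            conn, _addr = listener.accept()
            clients.append(conn)
            continue
        data = sock.recv(4096)
        if not data:  # empty read: the peer closed the connection
            clients.remove(sock)
            sock.close()
            continue
        for other in clients:
            if other is not sock:
                other.sendall(data)
```

The master would call relay_once() in a loop for as long as it runs; the clients list doubles as the in-memory record of who is connected.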
As far as running your process "off System".. in Unix/Linux this is referred to as running as a daemon. In *ix, these processes are children of process id 1, the init process.. which is the first process that starts when the system starts. You can't tell your process to become a child of init, this happens automatically when the existing parent exits. All "orphaned" processes are adopted by init. Since there are many easily found examples of writing a unix daemon (at this point the code you need to write to do this has become pretty standardized), I won't paste any code here, but here's one good example I found: http://web.archive.org/web/20060603181849/http://www.linuxprofilm.com/articles/linux-daemon-howto.html#ss4.1
If your linux distribution uses systemd (a recent replacement for init in some distributions), then you can do it as a systemd service, which is systemd's idea of a daemon but they do some of the work for you (for better or for worse.. there's a lot of complaints about systemd.. wars have been fought over it)...
Forking from your own program is one approach. However, a much simpler and easier one is to create a service. A service is a little wrapper around your program that deals with keeping it running, restarting it if it fails and providing ways to start and stop it.
This link shows you how to write a service. Although it's specifically for a web server application, the same logic can be applied to anything.
https://medium.com/#benmorel/creating-a-linux-service-with-systemd-611b5c8b91d6
Then to start the program you would write:
sudo systemctl start my_service_name
To stop it:
sudo systemctl stop my_service_name
To view its outputs:
sudo journalctl -u my_service_name
I need to run a server side script like Python "forever" (or as long as possible without losing state), so it can keep sockets open and asynchronously react to events like data received. For example if I use Twisted for socket communication.
How would I manage something like this?
Am I confused? Or are there better ways to implement asynchronous socket communication?
After starting the script once via Apache server, how do I stop it running?
If you are using twisted then it has a whole infrastructure for starting and stopping daemons.
http://twistedmatrix.com/projects/core/documentation/howto/application.html
How would I manage something like this?
Twisted works well for this, read the link above
Am I confused? Or are there better ways to implement asynchronous socket communication?
Twisted is very good at asynchronous socket communications. It is hard on the brain until you get the hang of it though!
After starting the script once via Apache server, how do I stop it running?
The twisted tools assume command line access, so you'd have to write a cgi wrapper for starting / stopping them if I understand what you want to do.
You can just write a script that sits in a while loop waiting for connections to happen, and that exits when it receives a signal telling it to close.
http://docs.python.org/library/signal.html
Then to stop it you just need to run another script that sends that signal to it.
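A bare-bones sketch of that idea follows. The choice of SIGTERM, the function names and the sleep standing in for the real connection handling are all my assumptions.

```python
import os
import signal
import time

stop_requested = False


def request_stop(signum, frame):
    """Signal handler: only record that a stop was requested."""
    global stop_requested
    stop_requested = True


def install_handler():
    """Arrange for SIGTERM to end the loop instead of killing the process."""
    signal.signal(signal.SIGTERM, request_stop)


def run_loop(poll_interval=0.5):
    """Keep working until the handler has seen a stop signal."""
    iterations = 0
    while not stop_requested:
        iterations += 1
        time.sleep(poll_interval)  # placeholder for accept()/recv() work
    return iterations


def send_stop(pid):
    """What the second script does: ask the running process to stop."""
    os.kill(pid, signal.SIGTERM)
```

The long-running script calls install_handler() and then run_loop(); the stopper script only needs the process id and calls send_stop(pid).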
You can use a ‘double fork’ to run your code in a new background process unbound to the old one. See eg this recipe with more explanatory comments than you could possibly want.
I wouldn't recommend this as a primary way of running background tasks for a web site. If your Python is embedded in an Apache process, for example, you'll be forking more than you want. Better to invoke the daemon separately (just under a similar low-privilege user).
After starting the script once via Apache server, how do I stop it running?
You have your second fork write the process number (pid) of the daemon process to a file, and then read the pid from that file and send it a terminate signal (os.kill(pid, signal.SIGTERM)).
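As a sketch, the two halves of the pid-file approach could look like this. The function names are mine, and the path is whatever location both the daemon and the control script can agree on.

```python
import os
import signal


def write_pid_file(path):
    """Daemon side: record our process id so a control script can find us."""
    with open(path, "w") as f:
        f.write(str(os.getpid()))


def stop_daemon(path):
    """Control side: read the pid back and ask that process to terminate."""
    with open(path) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGTERM)
    return pid
```

A stale pid file can point at an unrelated process after a crash or reboot, so a real control script should be careful about which pid it signals.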
Am I confused?
That's the question! I'm assuming you are trying to have a background process that responds on a different port to the web interface for some sort of unusual net service. If you are merely talking about responding to normal web requests you shouldn't be doing this, you should rely on Apache to handle your sockets and service one request at a time.
I think Comet is what you're looking for. Make sure to take a look at Tornado too.
You may want to look at FastCGI, it sounds exactly like what you are looking for, but I'm not sure if it's under current development. It uses a CGI daemon and a special apache module to communicate with it. Since the daemon is long running, you don't have the fork/exec cost. But that comes at the cost of managing your own resources (no automagic cleanup on every request).
One reason why this style of FastCGI isn't used much anymore is there are ways to embed interpreters into the Apache binary and have them run in the server. I'm not familiar with mod_python, but I know mod_perl has configuration to allow long running processes. Be careful here, since a long running process in the server can cause resource leaks.
A general question is: what do you want to do? Why do you need this second process, but yet somehow controlled by apache? Why can't you just build a daemon that talks to apache, why does it have to be controlled by apache?
I'm building a program that has a class used locally, but I want the same class to be used the same way over the network. This means I need to be able to make synchronous calls to any of its public methods. The class reads and writes files, so I think XML-RPC is too much overhead. I created a basic rpc client/server using the examples from twisted, but I'm having trouble with the client.
c = ClientCreator(reactor, Greeter)
c.connectTCP(self.host, self.port).addCallback(request)
reactor.run()
This works for a single call, when the data is received I'm calling reactor.stop(), but if I make any more calls the reactor won't restart. Is there something else I should be using for this? maybe a different twisted module or another framework?
(I'm not including the details of how the protocol works, because the main point is that I only get one call out of this.)
Addendum & Clarification:
I shared a google doc with notes on what I'm doing. http://docs.google.com/Doc?id=ddv9rsfd_37ftshgpgz
I have a version written that uses fuse and can combine multiple local folders into the fuse mount point. The file access is already handled within a class, so I want to have servers that give me network access to the same class. After continuing to search, I suspect pyro (http://pyro.sourceforge.net/) might be what I'm really looking for (simply based on reading their home page right now) but I'm open to any suggestions.
I could achieve similar results by using an nfs mount and combining it with my local folder, but I want all of the peers to have access to the same combined filesystem, so that would require every computer to be an nfs server with a number of nfs mounts equal to the number of computers in the network.
Conclusion:
I have decided to use rpyc as it gave me exactly what I was looking for. A server that keeps an instance of a class that I can manipulate as if it was local. If anyone is interested I put my project up on Launchpad (http://launchpad.net/dstorage).
If you're even considering Pyro, check out RPyC first, and re-consider XML-RPC.
Regarding Twisted: try leaving the reactor up instead of stopping it, and just ClientCreator(...).connectTCP(...) each time.
If you self.transport.loseConnection() in your Protocol you won't be leaving open connections.
For a synchronous client, Twisted probably isn't the right option. Instead, you might want to use the socket module directly.
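import socket
# self.host, self.port, output and size stand for your own values
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((self.host, self.port))
s.sendall(output)    # sendall() keeps writing until all of output is sent
data = s.recv(size)  # returns at most size bytes, possibly fewer
s.close()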
The recv() call might need to be repeated until you get an empty string, but this shows the basics.
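The repeated recv() could be wrapped up along these lines. This is a sketch that assumes the server closes the connection when it has finished sending, and the function name is mine; in Python 3 the "empty string" is an empty bytes object.

```python
import socket


def recv_until_closed(sock, chunk_size=4096):
    """Read from sock until the peer closes, and return everything received."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # empty result means the other side closed
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

If the server keeps the connection open between calls, you need some other way to know where a reply ends, such as a length prefix or a delimiter.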
Alternatively, you can rearrange your entire program to support asynchronous calls...
Why do you feel that it needs to be synchronous?
If you want to ensure that only one of these is happening at a time, invoke all of the calls through a DeferredSemaphore so you can rate limit the actual invocations (to any arbitrary value).
If you want to be able to run multiple streams of these at different times, but don't care about concurrency limits, then you should at least separate reactor startup and teardown from the invocations (the reactor should run throughout the entire lifetime of the process).
If you just can't figure out how to express your application's logic in a reactor pattern, you can use deferToThread and write a chunk of purely synchronous code -- although I would guess this would not be necessary.
If you are using Twisted you should probably know that:
You will not be making synchronous calls to any network service
The reactor can only ever be run once, so do not stop it (by calling reactor.stop()) until your application is ready to exit.
I hope this answers your question. I personally believe that Twisted is exactly the correct solution for your use case, but that you need to work around your synchronicity issue.
Addendum & Clarification:
Part of what I don't understand is that when I call reactor.run() it seems to go into a loop that just watches for network activity. How do I continue running the rest of my program while it uses the network? if I can get past that, then I can probably work through the synchronicity issue.
That is exactly what reactor.run() does. It runs a main loop which is an event reactor. It will not only wait for network events, but anything else you have scheduled to happen. With Twisted you will need to structure the rest of your application in a way to deal with its asynchronous nature. Perhaps if we knew what kind of application it is, we could advise.
For quite a long time I've wanted to start a pet project that aims, in time, to become a web hosting control panel focused mainly on Python hosting, meaning I would like to give users a way to generate and start Django (or other framework) projects right from the panel. I seem to have found the perfect tool to build my app with: CherryPy. It would allow me to do it the way I want, building the app with its own HTTP/HTTPS server, all in my favorite programming language.
But now a new question arises: as CherryPy is a threaded server, will it be the right choice for this kind of task?
There will be lots of time-consuming tasks, so if one of them blocks, the rest of the users trying to access other pages will be left waiting and eventually time out.
I imagine that this kind of problem wouldn't happen on a fork based server.
What would you advise?
"Threaded" and "Fork based" servers are equivalent. A "threaded" server has multiple threads of execution, and if one blocks then the others will continue. A "Fork based" server has multiple processes executing, and if one blocks then the others will continue. The only difference is that threaded servers by default will share memory between the threads, "fork based" ones by default will not share memory.
One other point - the "subprocess" module is not thread safe, so if you try to use it from CherryPy you will get weird errors. (This is Python Bug 1731717)
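To make the memory-sharing difference concrete, here is a small Unix-only sketch (os.fork() is not available on Windows). The counter and the function names are made up for the demonstration.

```python
import os
import threading

counter = {"value": 0}


def bump():
    counter["value"] += 1


def bump_in_thread():
    """A thread shares memory with its creator, so the change is visible."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


def bump_in_forked_child():
    """A forked child works on its own copy; the parent's value is unchanged."""
    pid = os.fork()
    if pid == 0:  # child process
        bump()
        os._exit(0)  # leave immediately without running parent cleanup code
    os.waitpid(pid, 0)
    return counter["value"]
```

Calling bump_in_thread() raises the parent's counter by one, while calling bump_in_forked_child() afterwards leaves it where it was, because only the child's private copy changed.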