SSL error after python/django fork

I've got a Python Django app where part of it parses a large file. This takes forever, so I added a fork to handle the processing, allowing the user to continue browsing the site. The forked code makes a bunch of calls to our Postgres database, hosted on Amazon.
I'm getting the following error:
SSL error: decryption failed or bad record mac
Here's the code:
pid = os.fork()
if pid == 0:
    lengthy_code_here(long)
    database_queries(my_database)
    os._exit(0)
None of my database calls are working, although they were working just fine before I inserted the fork. After looking around a little, it seems like it might be a stale database connection, but I'm not sure how to fix it. Does anyone have any ideas?

Forking while holding a socket open (such as a database connection) is generally not safe, as both processes will end up trying to use the same socket at once.
You will need, at a minimum, to close and reopen the database connection after forking.
Ideally, though, this is probably better suited for a task queueing system like Celery.
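A minimal sketch of the "reopen after forking" advice, assuming a Unix system. It uses the standard-library sqlite3 module as a stand-in for the Postgres connection so that the example stays self-contained; run_in_child and work are hypothetical names. In a Django app, the equivalent step would likely be calling django.db.connections.close_all() before forking, so that each process lazily opens its own connection on its next query.

```python
import os
import sqlite3
import traceback


def run_in_child(db_path, work):
    """Fork, and let the child open its OWN connection instead of reusing the parent's."""
    pid = os.fork()
    if pid != 0:
        return pid  # parent: return immediately with the child's PID

    status = 1
    try:
        conn = sqlite3.connect(db_path)  # fresh connection, created after the fork
        try:
            work(conn)
            conn.commit()
            status = 0
        finally:
            conn.close()
    except Exception:
        traceback.print_exc()
    finally:
        os._exit(status)  # leave without running the parent's cleanup handlers
```

The parent should still reap the child at some point (for example with os.waitpid), otherwise finished children linger as zombies.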

Django in production typically has a process dispatching to a bunch of processes that house Django/Python. These processes are long-running, i.e. they do NOT terminate after handling one request. Rather, they handle a request, and then another, and then another, etc. This means that changes which are not restored or cleaned up at the end of servicing a request will affect future requests.
When you fork a process, the child inherits various things from the parent, including all open descriptors (files, queues, directories). Even if you do nothing with the descriptors, there is still a problem, because when a process dies all its open descriptors will be cleaned up.
So when you fork from a long running process you are setting yourself up to close all the open descriptors (such as the ssl connection) when the child process dies after it finishes processing. There are ways to prevent this from happening in a fork, but they can sometimes be difficult to get right.
A better design is to not fork, and instead hand off to another process that is either running, or started in a safer manner. For example:
at(1) can be used to queue up jobs for later (or immediate) execution
message queues can be used to pass messages to other daemons
standard IPC constructs such as pipes can be used to communicate to other daemons
update:
If you want to use at(1) you will have to create a standalone script. You can use a serializer to pass the data from django to the script.
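As a rough illustration of handing work to a standalone script, the sketch below serializes the data to a JSON file and starts the script as a separate process. It uses subprocess.Popen rather than at(1) so that it does not depend on at being installed; the same command line could presumably be queued with at(1) instead. hand_off, worker_script and the payload layout are hypothetical names chosen for this example.

```python
import json
import subprocess
import sys
import tempfile


def hand_off(worker_script, payload):
    """Serialize payload to a JSON file and start worker_script on it."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(payload, f)
    # close_fds=True keeps the parent's open descriptors (e.g. the database
    # socket) out of the new process; start_new_session detaches it from the
    # parent's process group.
    return subprocess.Popen(
        [sys.executable, worker_script, f.name],
        close_fds=True,
        start_new_session=True,
    )
```

The script would read the file named in sys.argv[1], open its own database connection, and delete the file when it is done.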

Related

Can I allow my server process to restart without killing existing connections?

In an attempt to make my terminal-based program survive longer, I was told to look into forking the process off of system. I can't find much on specifying a PID that I want to spawn a new process off of.
Is this possible in Linux? I am mainly a Windows guy.
My program is going to be dealing with sockets and if my application crashed then I would lose lots of information. I was under the impression that if it was forked from system the sockets would stay alive?
EDIT: Here is what I am trying to do. I have multiple computers that I want to communicate with. So I am building a program that lets me listen on a socket(simple). Then I will connect to it from each of my remote computers(simple).
Once I have a connection I want to open a new terminal, and use my program to start interacting with the remote computer(simple).
The questions came from this portion. The client shell will send all traffic to the main shell, which will then send it out to the remote computer. When a response is received, it goes to the main shell, which forwards it to the client shell.
The issue is keeping each client shell in the loop. I want all client shells to know who is connected to who on each client shell. So client shell 1 should tell me if I have a client shell 2, 3, 4, 5, etc and who is connected to it. This jumped into sharing resources between different processes. So I was thinking about using local sockets to send data between all these client shells. But then I ran into a problem if the main shell were to die, everything is lost. So I wanted a way to try and secure it.
If that makes sense.

So, you want to be able to reload a program without losing your open socket connections?
The first thing to understand is that when a process exits, all open file descriptors are closed. This includes socket connections. Running as a daemon does not change that. A process becomes a daemon by becoming independent of your terminal session, so that it will continue to run when your terminal session ends. But, like any other process, when a daemon terminates for any reason (normal exit, crash, kill, machine restart, etc), all connections to it cease to exist. BTW this is not specific to Unix; Windows is the same.
So, the short answer to your question is NO, there's no way to tell unix/linux to not close your sockets when your process stops, it will close them and that's that.
The long answer is, there are a few ways to re-engineer things to get around this:
1) You can have your program exec() itself when you send it a special message or signal (eg SIGHUP). In unix, exec (or any of its several variants) does not end or start any process; it simply loads code into the current process and starts execution. The new code takes the place of the old within the same process. Since the process remains the same, any open files remain open (unless they are marked close-on-exec). However you will lose any data that you had in memory, so the sockets will be open, but your program will know nothing about them. On startup you'd have to use various system calls to discover which descriptors are open in your process and whether any of them are socket connections to clients. One way to get around this would be to pass critical information as command line arguments or environment variables, which can be passed through the exec() call and thus preserved for use by the new code when it starts executing.
Keep in mind that this only works when the process calls exec ITSELF while it is still running. So you cannot recover from a crash or any other cause of your process ending.. your connections will be gone. But this method does solve the problem of you wanting to load new code without losing your connections.
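The exec approach in 1) could be sketched in Python roughly as follows. The environment variable name INHERITED_SOCKET_FDS and all function names are made up for this example. Note that Python 3 marks descriptors as non-inheritable by default, so they have to be made inheritable explicitly before the exec.

```python
import os
import signal
import socket
import sys

FD_ENV = "INHERITED_SOCKET_FDS"  # hypothetical variable name, not a standard one


def reexec_keeping(sockets):
    """Replace the running program with a fresh copy of itself, keeping sockets open."""
    fds = [s.fileno() for s in sockets]
    for fd in fds:
        os.set_inheritable(fd, True)  # otherwise the descriptor is closed on exec
    os.environ[FD_ENV] = ",".join(str(fd) for fd in fds)
    os.execv(sys.executable, [sys.executable] + sys.argv)  # does not return


def recover_sockets():
    """Call at startup: rebuild socket objects from descriptors named in the environment."""
    raw = os.environ.pop(FD_ENV, "")
    return [socket.socket(fileno=int(fd)) for fd in raw.split(",") if fd]


def install_reload_handler(sockets):
    """Re-exec on SIGHUP; `sockets` is the live list of open connections."""
    signal.signal(signal.SIGHUP, lambda signum, frame: reexec_keeping(sockets))
```

Only the descriptors survive; any per-connection state (who the peer is, partial messages) would still have to be passed along or rebuilt by the new code.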
2) You can bypass the issue by dividing your server (master) into two processes. The first (call it the "proxy") accepts the TCP connections from the clients and keeps them open. The proxy can never exit, so it should be kept so simple that you'll rarely want to change that code. The second process runs the "worker", which is the code that implements your application logic. All the code you might want to change often should go in the worker. Now all you need to do is establish interprocess communication from the proxy to the worker, and make sure that if the worker exits, there's enough information in the proxy to re-establish your application state when the worker starts up again. In a really simple, low volume application, the mechanism can be as simple as the proxy doing a fork() + exec() of the worker each time it needs to do something. A fancier way to do this, which I have used with good results, is a unix domain datagram (SOCK_DGRAM) socket. The proxy receives messages from the clients, forwards them to the worker through the datagram socket, the worker does the work, and responds with the result back to the proxy, which in turn forwards it back to the client. This works well because as long as the proxy is running and has opened the unix domain socket, the worker can restart at will. Shared memory can also work as a way to communicate between proxy and worker.
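A minimal sketch of the datagram link between proxy and worker described in 2), assuming both sides bind a Unix domain SOCK_DGRAM socket at a filesystem path so that replies can be addressed. The function names are hypothetical, and the "work" here is just upper-casing the message.

```python
import os
import socket


def bind_dgram(path):
    """Create a Unix domain datagram socket bound to `path`."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket file left by a previous run
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(path)
    return sock


def worker_serve_once(worker_sock):
    """Worker side: receive one request and send the result back to the sender."""
    data, proxy_addr = worker_sock.recvfrom(65536)
    worker_sock.sendto(data.upper(), proxy_addr)  # stand-in for real application logic


def proxy_forward(proxy_sock, worker_path, client_msg, timeout=5.0):
    """Proxy side: forward a client message to the worker and wait for the reply."""
    proxy_sock.settimeout(timeout)  # do not hang forever if the worker is down
    proxy_sock.sendto(client_msg, worker_path)
    reply, _ = proxy_sock.recvfrom(65536)
    return reply
```

Because the proxy addresses the worker by path on every send, the worker can exit and rebind the same path without the proxy reopening anything. A send while the worker is down raises an OSError that the proxy would need to handle, for example by retrying.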
3) You can use the unix domain socket along with the sendmsg() and recvmsg() functions and the SCM_RIGHTS ancillary message type to pass not the client data itself, but the open socket file descriptors from the old instance to the new. This is the only way to pass open file descriptors between unrelated processes. Using this mechanism, there are all sorts of strategies you can implement. For example, you could start a new instance of your master program, and have it connect (via a unix domain socket) to the old instance and transfer all the sockets over. Then your old instance can exit. Or, you can use the proxy/worker model, but instead of passing messages through the proxy, you can just have the proxy hand the socket descriptor to the worker via the unix domain socket between them, and then the worker can talk directly to the client using that descriptor. Or, you could have your master send all its socket file descriptors to another "stash" process that holds on to them in case the master needs to restart. There are all sorts of architectures possible. Keep in mind that the operating system only provides the ability to ship the descriptors around; all the other logic you have to code for yourself.
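The descriptor passing in 3) might look like the following in Python, closely following the pattern shown in the standard library's socket documentation for sendmsg() and recvmsg(). The helper names are chosen for this example (newer Python versions also ship similar helpers in the socket module).

```python
import array
import socket


def send_fds(sock, msg, fds):
    """Send `msg` plus the open descriptors in `fds` over a Unix domain socket."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    return sock.sendmsg([msg], ancillary)


def recv_fds(sock, msglen, maxfds):
    """Receive a message and up to `maxfds` descriptors sent with SCM_RIGHTS."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = sock.recvmsg(
        msglen, socket.CMSG_LEN(maxfds * fds.itemsize)
    )
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Drop any trailing bytes that do not form a whole integer.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    return msg, list(fds)
```

The receiver gets new descriptor numbers that refer to the same open sockets, so the sender can close its copies afterwards without ending the client connections.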
4) You can accept that no matter how careful you are, inevitably connections will be lost. Networks are unreliable, programs crash sometimes, machines are restarted. So rather than going to significant effort to make sure your connections don't close, you can instead engineer your system to recover when they inevitably do.
The simplest approach to this would be: Since your clients know who they wish to connect to, you could have your client processes run a loop where, if the connection to the master is lost for any reason, they periodically try to reconnect (let's say every 10-30 seconds), until they succeed. So all the master has to do is to open up the rendezvous (listening) socket and wait, and the connections will be re-established from every client that is still out there running. The client then has to re-send any information it has which is necessary to re-establish proper state in the master.
The list of connected computers can be kept in the memory of the master, there is no reason to write it to disk or anywhere else, since when the master exits (for any reason), those connections don't exist anymore. Any client can then connect to your server (master) process and ask it for a list of clients that are connected.
Personally, I would take this last approach. Since it seems that in your system, the connections themselves are much more valuable than the state of the master, being able to recover them in the event of a loss would be the first priority.
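A sketch of the client-side reconnect loop from approach 4), with a hypothetical connect_with_retry helper. The delay and max_attempts parameters are assumptions added for illustration; a real client would re-send its identifying state after each successful reconnect.

```python
import socket
import time


def connect_with_retry(address, delay=10.0, max_attempts=None):
    """Keep trying to connect to the master until it succeeds.

    `address` is a (host, port) tuple. With max_attempts=None the loop
    retries forever, sleeping `delay` seconds between attempts.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return socket.create_connection(address, timeout=5)
        except OSError:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            time.sleep(delay)
```

The client would call this again whenever a send or receive on the returned socket fails.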
In any case, since it seems that the role of the master is simply to pass data back and forth among clients, this would be a good application of "asynchronous" socket I/O using the select() or poll() functions. This allows you to communicate over multiple sockets in one process without blocking. Here's a good example of a poll() based server that accepts multiple connections:
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/rzab6/poll.htm
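For comparison with the C example linked above, here is a small Python sketch of the same idea using the standard selectors module, which wraps select()/poll() and similar calls. It only echoes data back to the sender; a relay like the one described in the question would forward the data to another registered socket instead. make_server and poll_once are names invented for this example.

```python
import selectors
import socket


def make_server(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket and register it with a selector."""
    sel = selectors.DefaultSelector()
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="listener")
    return sel, listener


def poll_once(sel, timeout=1.0):
    """Handle whatever sockets are ready: accept new clients, echo client data."""
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        if key.data == "listener":
            conn, _addr = sock.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, data="client")
        else:
            data = sock.recv(4096)
            if data:
                # Simplification: large replies on a non-blocking socket
                # would need write buffering instead of sendall().
                sock.sendall(data)
            else:
                sel.unregister(sock)  # empty read means the peer closed
                sock.close()
```

A server would call poll_once in a loop for as long as it runs.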
As far as running your process "off System" goes: in Unix/Linux this is referred to as running as a daemon. In *ix, these processes are children of process id 1, the init process, which is the first process that starts when the system starts. You can't tell your process to become a child of init; this happens automatically when the existing parent exits. All "orphaned" processes are adopted by init. Since there are many easily found examples of writing a unix daemon (at this point the code you need to write to do this has become pretty standardized), I won't paste any code here, but here's one good example I found: http://web.archive.org/web/20060603181849/http://www.linuxprofilm.com/articles/linux-daemon-howto.html#ss4.1
If your Linux distribution uses systemd (a recent replacement for init in some distributions), then you can run it as a systemd service, which is systemd's idea of a daemon, but it does some of the work for you (for better or for worse: there are a lot of complaints about systemd, and wars have been fought over it).

Forking from your own program is one approach. However, a much simpler and easier one is to create a service. A service is a little wrapper around your program that deals with keeping it running, restarting it if it fails, and providing ways to start and stop it.
This link shows you how to write a service. Although it's specifically for a web server application, the same logic can be applied to anything.
https://medium.com/@benmorel/creating-a-linux-service-with-systemd-611b5c8b91d6
Then to start the program you would write:
sudo systemctl start my_service_name
To stop it:
sudo systemctl stop my_service_name
To view its outputs:
sudo journalctl -u my_service_name
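For orientation, a minimal systemd unit file for such a service might look roughly like the sketch below. The file path, description and program path are placeholders, not taken from the linked article; only my_service_name matches the commands above.

```ini
# /etc/systemd/system/my_service_name.service  (hypothetical path and names)
[Unit]
Description=My long-running program
After=network.target

[Service]
# Placeholder path: point this at your own executable.
ExecStart=/usr/local/bin/my_program
# Restart the program if it exits with an error or crashes.
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

After creating or editing the file, run sudo systemctl daemon-reload so that systemd picks up the change.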

Sharing psycopg2 / libpq connections across processes

According to psycopg2 docs:
libpq connections shouldn’t be used by a forked processes, so when using a module such as multiprocessing or a forking web deploy method such as FastCGI make sure to create the connections after the fork.
Following the link from that document leads to:
On Unix, forking a process with open libpq connections can lead to unpredictable results because the parent and child processes share the same sockets and operating system resources. For this reason, such usage is not recommended, though doing an exec from the child process to load a new executable is safe.
But it seems there's no inherent problem with forking processes with open sockets. So what's the reason for psycopg2's warning against forking when connections are open?
The reason for my question is that I saw a (presumably successful) multiprocessing approach that opened a connection right before forking.
Perhaps it is safe to fork open connections under some restrictions (e.g., only one process actually ever uses the connection, etc.)?

Your surmise is basically correct: there is no issue with a connection being opened before a fork as long as you don't attempt to use it in more than one process.
That being said, I think you misunderstood the "multiprocessing approach" link you provided. It actually demonstrates a separate connection being opened in each child. (There is a connection opened by the parent before forking, but it's not being used in any child.)
The improvement given by the answer there (versus the code in the question) was to refactor so that -- rather than opening a new connection for each task in the queue -- each child process opened a single connection and then shared it across multiple tasks executed within the same child (i.e. the connection is passed as an argument to the Task processor).
Edit:
As a general practice, one should prefer creating a connection within the process that is using it. In the answer cited, a connection was being created in the parent before forking, then used in the child. This does work fine, but leaves each "child connection" open in the parent as well, which is at best a waste of resources and also a potential cause of bugs.
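A sketch of the pattern described above, where each child opens a single connection after the fork and reuses it for every task it processes. The standard-library sqlite3 module stands in for psycopg2 to keep the example self-contained; the table name items and the function names are assumptions made for this example.

```python
import multiprocessing
import os
import sqlite3


def worker(db_path, tasks, results):
    """Child process: open ONE connection after the fork, reuse it for all tasks."""
    conn = sqlite3.connect(db_path)
    try:
        for task in iter(tasks.get, None):  # None is the stop sentinel
            (count,) = conn.execute(
                "SELECT COUNT(*) FROM items WHERE value >= ?", (task,)
            ).fetchone()
            results.put((os.getpid(), task, count))
    finally:
        conn.close()


def run_tasks(db_path, task_list, workers=2):
    """Parent: never opens a connection itself, only distributes tasks."""
    ctx = multiprocessing.get_context("fork")  # Unix only
    tasks, results = ctx.Queue(), ctx.Queue()
    procs = [
        ctx.Process(target=worker, args=(db_path, tasks, results))
        for _ in range(workers)
    ]
    for p in procs:
        p.start()
    for task in task_list:
        tasks.put(task)
    for _ in procs:
        tasks.put(None)
    collected = [results.get() for _ in task_list]  # drain before joining
    for p in procs:
        p.join()
    return collected
```

Since the parent holds no connection at fork time, there is nothing shared between processes and nothing left open in the parent.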

Is it possible to create a long running process in NodeJs

Is it possible to create a long running process in NodeJs to handle many background operations without interrupting the main thread; something like Celery in Python.
Hint, it's highly preferable to be able to manage that long-running process, in case of failure, or need to be restarted, away from the main process.

http://nodejs.org/api/child_process.html is the right API to create long-running processes; you will have complete control over the child processes (access to stdin/out/err, can send signals, etc). This approach, however, requires that your node process is the parent of those children. If you want the child to outlive the parent, take a look at options.detached during child creation (and the following child.unref()).
Please note, however, that Node.js is extremely well suited to avoiding such an architecture. Typically Node.js does all the background stuff in the main thread. I've been writing apps with lots of traffic (like thousands of requests per second), with DB, Redis and RabbitMQ access all from the main thread and without any child processes, and it worked fine, as it should, thanks to Node's evented IO system.
I generally use the child_process API only to launch separate executables (e.g. ffmpeg to transcode some video file); apart from such scenarios, separate processes are probably not what you want.
There is also the cluster API, which allows a single master to handle numerous worker processes, though I think it isn't what you are looking for either.

You can create a child process to handle your background operations, and then use messages to pass data between the new process and your main thread.
http://nodejs.org/api/child_process.html
Update
It looks like you need a server-side job queue, something like beanstalkd http://kr.github.io/beanstalkd/ + https://www.npmjs.com/package/fivebeans.

Prevent a second process from listening to the same pipe in Python

I have a process that connects to a pipe with Python 2.7's multiprocessing.Listener() and waits for a message with recv(). I run it variously on Windows 7 and Ubuntu 11.
On Windows, the pipe is called \\.\pipe\some_unique_id. On Ubuntu, the pipe is called /temp/some_unique_id. Other than that, the code is the same.
All works well, until, in an unrelated bug, monit starts a SECOND copy of the same program. It tries to listen to the exact same pipe.
I had naively* expected that the second connection attempt would fail, leaving the first connection unscathed.
Instead, I find the behaviour is officially undefined.
Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time.
On Ubuntu, the earlier copies seem to be ignored, and are left without any messages, while the latest version wins.
On Windows, there is some more complex behaviour. Sometimes the original pipe raises an EOFError exception on the recv() call. Sometimes, both listeners are allowed to co-exist and each message is distributed arbitrarily.
Is there a way to open a pipe exclusively, so the second process cannot open the pipe while the first process hasn't closed it or exited?
* I could have sworn I manually tested this exact scenario, but clearly I did not.
Other SO questions I looked at:
several TCP-servers on the same port - I don't (knowingly) set SO_REUSEADDR
Can two applications listen to the same port?
accept() with sockets shared between multiple processes (based on Apache preforking) - there's no forking involved.

Named pipes have the same access semantics as regular files. Any process with read or write permission can open the pipe for reading or writing.
If you had a way to guarantee that the two instances of the Python script were invoked by processes with differing UIDs or GIDs, then you could implement unique access control using file permissions.
If both instances of the script have the same UID and GID, you can try file locking as implemented in Skip Montanaro's FileLock, hosted on GitHub. YMMV.
A simpler way to implement this might be to create a lock file in /var/lock that contains the PID of the process creating the lock file, and then check for the existence of the lock file before opening the pipe. This scheme is used by most long-running daemons, but has problems when the processes that create the lock files terminate in situations that prevent them from removing the lock file.
You could also try a Python System V semaphore to prevent simultaneous access.
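One variation on the lock file idea, sketched below, is to take an advisory lock on the file with fcntl.flock instead of only checking whether the file exists. The kernel releases the lock when the process exits for any reason, which sidesteps the stale lock file problem. This is a Unix-only substitute for the schemes named above, not what FileLock itself does, and acquire_single_instance_lock is a hypothetical name.

```python
import fcntl
import os


def acquire_single_instance_lock(lock_path):
    """Return an open descriptor holding the lock, or None if another holder exists."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # fail instead of waiting
    except OSError:
        os.close(fd)
        return None
    # Record our PID for humans and monitoring tools; the lock itself is what counts.
    os.ftruncate(fd, 0)
    os.write(fd, str(os.getpid()).encode())
    return fd  # keep this descriptor open for as long as the pipe is in use
```

The process would call this before creating the Listener and exit immediately if it returns None.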

starting my own threads within python paste

I'm writing a web application using pylons and paste. I have some work I want to do after an HTTP request is finished (send some emails, write some stuff to the db, etc) that I don't want to block the HTTP request on.
If I start a thread to do this work, is that OK? I always see this stuff about paste killing off hung threads, etc. Will it kill my threads which are doing work?
What else can I do here? Is there a way I can make the request return but have some code run after it's done?
Thanks.

You could use a thread approach (maybe setting the Thread.daemon property would help, but I'm not sure).
However, I would suggest looking into a task queuing system. You can place a task on a queue (which is very fast), then a listener can handle the tasks asynchronously, allowing the HTTP request to return quickly. There are two task queues that I know of for Django:
Django Queue Service
Celery
You could also consider using a more "enterprise" messaging solution, such as RabbitMQ or ActiveMQ.
Edit: previous answer with some good pointers.

I think the best solution is a messaging system, because it can be configured not to lose the task if the pylons process goes down. I would always use processes over threads, especially in this case. If you are using Python 2.6+, use the built-in multiprocessing module, or you can always install the processing module, which you can find on PyPI (I can't post a link because I am a new user).

Take a look at gearman, it was specifically made for farming out tasks to 'workers' to handle. They can even handle it in a different language entirely. You can come back and ask if the task was completed, or just let it complete. That should work well for many tasks.
If you absolutely need to ensure it was completed, I'd suggest queuing tasks in a database or somewhere persistent, then have a separate process that runs through it ensuring each one gets handled appropriately.
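A rough sketch of the persistent queue idea, using a SQLite table as the "somewhere persistent". The table layout and function names are assumptions for illustration. The web request would call enqueue, and a separate process would call process_pending periodically.

```python
import json
import sqlite3


def open_queue(path):
    """Open (and create if needed) the task table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def enqueue(conn, payload):
    """Called from the request handler: record the task and return quickly."""
    cur = conn.execute("INSERT INTO tasks (payload) VALUES (?)", (json.dumps(payload),))
    conn.commit()
    return cur.lastrowid


def process_pending(conn, handler):
    """Called from the separate worker process: run every task not yet done."""
    rows = conn.execute(
        "SELECT id, payload FROM tasks WHERE done = 0 ORDER BY id"
    ).fetchall()
    for task_id, payload in rows:
        handler(json.loads(payload))  # if this raises, the task stays pending
        conn.execute("UPDATE tasks SET done = 1 WHERE id = ?", (task_id,))
        conn.commit()
    return len(rows)
```

A task is only marked done after its handler returns, so a crash mid-task means it is retried on the next pass; handlers therefore need to tolerate running twice.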

To answer your basic question directly, you should be able to use threads just as you'd like. The "killing hung threads" part is paste cleaning up its own threads, not yours.
There are other packages that might help, etc, but I'd suggest you start with simple threads and see how far you get. Only then will you know what you need next.
(Note, "Thread.daemon" should be mostly irrelevant to you here. Setting that to true will ensure a thread you start will not prevent the entire process from exiting. Doing so would mean, however, that if the process exited "cleanly" (as opposed to being forced to exit) your thread would be terminated even if it wasn't done with its work. Whether that's a problem, and how you handle things like that, depends entirely on your own requirements and design.)
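A minimal sketch of the "simple threads" approach: one background thread that drains a queue of callables, so the request handler only pays for a queue put. BackgroundWorker is a hypothetical class written in Python 3 syntax; it is not part of pylons or paste. It marks the thread as a daemon, so per the note above any tasks still queued are dropped when the process exits unless wait() is called first.

```python
import queue
import threading


class BackgroundWorker:
    """Runs submitted callables one at a time on a single background thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, func, *args):
        """Queue func(*args) and return immediately."""
        self._tasks.put((func, args))

    def wait(self):
        """Block until every task submitted so far has finished."""
        self._tasks.join()

    def _run(self):
        while True:
            func, args = self._tasks.get()
            try:
                func(*args)
            except Exception:
                pass  # a real application would log the failure here
            finally:
                self._tasks.task_done()
```

A request handler would call something like worker.submit(send_email, address), where send_email is your own function, and then return its response.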
