Python FastCGI under IIS - stdout writing problems

I'm having a very peculiar problem in my Python FastCGI code - sys.stdout has a file descriptor of '-1', so I can't write to it.
I'm checking this at the first line of my program, so I know it's not any of my code changing it.
I've tried sys.stdout = os.fdopen(1, 'w'), but anything written there won't get to my browser.
The same application works without difficulty under Apache.
I'm using the Microsoft-provided FastCGI extension for IIS documented here: http://learn.iis.net/page.aspx/248/configuring-fastcgi-extension-for-iis60/
I am using these settings in fcgiext.ini:
ExePath=C:\Python23\python.exe
Arguments=-u C:\app\app_wsgi.py
FlushNamedPipe=1
RequestTimeout=45
IdleTimeout=120
ActivityTimeout=30
Can anyone tell me what's wrong, or where I should look to find out?
All suggestions greatly appreciated...

Forgive me if this is a dumb question, but I notice this line in your config file:
Arguments=-u C:\app\app_wsgi.py
Are you running a WSGI application or a FastCGI app? There is a difference. In WSGI, writing to stdout isn't a good idea. Your program should have an application object that can be called with an environment dict and a start_response function (for more info, see PEP 333). At any rate, your application returns its response by returning an iterable object containing the response body, not by writing to stdout.
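For reference, a minimal WSGI callable looks roughly like this. It's only a sketch, written Python 2-style to match the interpreter in the question, and the name application is simply whatever your gateway is configured to look for:

def application(environ, start_response):
    # environ is a CGI-style dict; start_response takes the status line
    # and a list of (header-name, value) pairs.
    body = 'Hello from WSGI\n'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]   # an iterable holding the response body; stdout is never touched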
Either way, you should also consider using isapi-wsgi. I've never used it myself, but I hear good things about it.

Do you have to use FastCGI? If not, you may want to try an ISAPI WSGI method. I have had success using:
http://code.google.com/p/isapi-wsgi/
and have also used PyISAPIe in the past:
http://sourceforge.net/apps/trac/pyisapie

I believe having stdout closed/invalid is in accordance with the FastCGI spec:
The Web server leaves a single file descriptor, FCGI_LISTENSOCK_FILENO, open when the application begins execution. This descriptor refers to a listening socket created by the Web server.
FCGI_LISTENSOCK_FILENO equals STDIN_FILENO. The standard descriptors STDOUT_FILENO and STDERR_FILENO are closed when the application begins execution. A reliable method for an application to determine whether it was invoked using CGI or FastCGI is to call getpeername(FCGI_LISTENSOCK_FILENO), which returns -1 with errno set to ENOTCONN for a FastCGI application.
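Translated into Python, that spec check looks roughly like the sketch below. This is only an illustration of the rule, not code from the spec, and on Windows/IIS the FastCGI extension actually talks over named pipes, so this POSIX-flavoured socket check may not apply directly there:

import errno
import socket

FCGI_LISTENSOCK_FILENO = 0   # FastCGI reuses stdin's descriptor as its listen socket

def invoked_as_fastcgi():
    # fromfd() duplicates fd 0, so closing the duplicate leaves stdin untouched.
    sock = socket.fromfd(FCGI_LISTENSOCK_FILENO, socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.getpeername()
    except socket.error as exc:
        # ENOTCONN: fd 0 is an unconnected listening socket, i.e. FastCGI.
        # Anything else (e.g. ENOTSOCK under plain CGI) means it is not.
        return exc.errno == errno.ENOTCONN
    finally:
        sock.close()
    return False   # getpeername() succeeded, so fd 0 is an ordinary connected socket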

On Windows, it's possible to launch a process without a valid stdin and stdout. For example, if you execute a Python script with pythonw.exe, stdout is invalid, and if you insist on writing to it, it will block after 140 characters or so.
Writing to a destination other than stdout looks like the safest solution.

Following PEP 333, you can try to log to environ['wsgi.errors'], which is usually the web server's own error log when you use FastCGI. Of course, this is only available while a request is being handled, not during application startup.
You can get an example in the pylons code: http://pylonshq.com/docs/en/0.9.7/logging/#logging-to-wsgi-errors
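A minimal sketch of that idea, inside whatever your WSGI application callable happens to be (the handler below is made up for illustration):

def application(environ, start_response):
    # wsgi.errors is a file-like object supplied with each request; under
    # FastCGI it usually ends up in the web server's error log.
    environ['wsgi.errors'].write('handling %s\n' % environ.get('PATH_INFO', '/'))
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['ok\n']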

Related

File descriptor doesn't update for logging in Django

We use Python (2.7), Django (1.8.1) and Gunicorn (19.4.5) for our web application, and supervisor (3.0) to monitor it. I have recently encountered two issues with logging:
Django was logging into the previous day's logs (we have log rotation enabled).
Django was not logging anything at all.
The first scenario is understandable: log rotation changed the file, but Django was not updated.
The second scenario was fixed when I restarted the supervisor process, which again led me to believe the file descriptor was not updated in the Django process.
I came across this SO thread, which states:
Each child is an independent process, and file handles in the parent may be closed in the child after a fork (assuming POSIX). In any case, logging to the same file from multiple processes is not supported.
So I have a few questions:
My Gunicorn has 4 child processes; if one of them fails while writing to a log file, will the other child processes be unable to use it? And how do I debug these kinds of scenarios?
Personally, I find debugging errors in the Python logging module difficult. Can someone point out how to debug errors such as this, or is there any way I can monkey-patch logging so it does not fail silently? (Kindly read the update section.)
I have seen Django's log rotation cause issue type 1 as explained above, rather than some script scheduled via cron. So which is preferable?
Note: The logging config is not the problem; I have already spent a fair amount of time ruling that out. Also, if the config were the issue, Django would not write log files even after a process restart.
Update:
For my second question, I see that the logging module provides an option to raiseExceptions on failure, although this is discouraged in production environments. Documentation here. So now my question becomes: how do I set this in Django?
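For what it's worth, raiseExceptions appears to be just a module-level flag consulted by Handler.handleError(), so one minimal way to set it, assuming the global switch is all that is wanted, would be something like:

import logging   # e.g. in settings.py or any module imported at startup

# handleError() checks this flag; when True it prints a traceback to stderr
# instead of silently swallowing errors raised while emitting a record.
logging.raiseExceptions = True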
I felt like closing this question. It's a bit awkward and seems stupid after 2 months, but I guess being stupid is part of learning, and I want this to serve as a reference for people who stumble across it.
Scenario 1: Django, when using TimedRotatingFileHandler, sometimes seems not to update the file descriptor and hence writes to old log files unless we restart supervisor. We have yet to find the reason for this behaviour and will update this answer if we do. For now we are using WatchedFileHandler and then the logrotate utility to rotate the logs.
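A rough sketch of that WatchedFileHandler setup (the path and logger names below are placeholders, not our real config):

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'app_file': {
            # WatchedFileHandler stats the file on every emit and reopens it
            # if logrotate has moved it, so no process restart is needed.
            'class': 'logging.handlers.WatchedFileHandler',
            'filename': '/var/log/myapp/app.log',
        },
    },
    'loggers': {
        'django': {'handlers': ['app_file'], 'level': 'INFO'},
    },
}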
Scenario 2: This is the stupid part. When I was logging with some string formatting, I forgot to supply enough variables, which is why the logger was erring. But this didn't get propagated. Locally, while testing, I found that the logging module was actually throwing that error, but silently, and any logs after it in the module were not getting printed. Lessons learned from this scenario:
If there is a problem with logging, first check that the string formatting does not err.
Use eagerly formatted calls like log.debug('example: {msg}'.format(msg=msg)) instead of deferred formatting like log.debug('example: %s', msg), so formatting mistakes surface at the call site (see the sketch below).
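To illustrate that last point, a small sketch of the two styles (not our production code):

import logging

log = logging.getLogger(__name__)
user = 'alice'

# Deferred formatting: the mismatch between '%s %s' and a single argument only
# blows up inside the handler, where handleError() may hide it from you.
log.debug('login by %s from %s', user)

# Eager formatting: str.format() runs before logging ever sees the message, so
# a mistake in the format string raises right here at the call site.
log.debug('login by {user}'.format(user=user))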

Python Seccomp Allow STDIN

I'm working on a project where I will be running potentially malicious code. Its basic organization is that there is a master and a slave process. The slave process runs the potentially malicious code and has seccomp enabled.
import prctl
prctl.set_seccomp(True)
This is how seccomp is turned on. I can communicate fine FROM the slave TO the master, but not the other way around. When I don't turn on seccomp, I can use:
import sys
lines = sys.stdin.read()
Or something along those lines. I found this quite odd; I should have access to read and write given the default parameters of seccomp, especially for stdin/stdout. I have even tried opening stdin before I turn on seccomp. For example:
stdinFile = sys.stdin
prctl.set_seccomp(True)
lines = stdinFile.read()
But still to no avail. I have also tried readlines(), which doesn't work. A friend suggested that I try Unix domain sockets, opening one before seccomp goes on and then just using the write() call. This didn't work either. If anyone has any suggestions on how to combat this problem, please post them! I have seen some C code along the lines of
seccomp_add_rule(stuff)
But I have been unsuccessful at using this in Python with the cffi module.
sys.stdin is not a file handle; you need to open it and get a file handle before calling set_seccomp. You could use os.fdopen for this. The file descriptor for stdin/stdout is available as sys.stdin.fileno().
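A minimal sketch of that suggestion, assuming the python-prctl module from the question; treat it as an illustration rather than a verified fix:

import os
import sys
import prctl

# Wrap stdin's descriptor in a fresh file object *before* entering seccomp,
# so no further open() calls are needed once the filter is active.
stdin_fh = os.fdopen(sys.stdin.fileno(), 'r')

prctl.set_seccomp(True)   # strict mode: only read/write/_exit/sigreturn allowed

lines = stdin_fh.read()   # read(2) on an already-open descriptor is still permitted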

How to make NSLog work with Python's logging module when using PyObjC?

I'm writing a Django-based webapp that imports a Cocoa framework via PyObjC. The Cocoa framework has NSLog() littered all through it and while I can see them when running the Django server in non-daemon mode, as soon as I go to daemon I simply lose all this useful NSLog() output.
Is there any easy way to get NSLog stuff to bubble up into the Python logging module's world so it can be merged in with the log messages being emitted by the actual Python code?
Did a little Googling and it seems like you might have to redirect stderr and somehow suck it back into Python in order to achieve this, which would be kind of a bummer ...
Any help much appreciated.
According to this page, NSLog basically works like
fprintf(stderr, format_string, args ...);
so you do need to capture / redirect the standard error output. I wrote a post some time ago which might help for Python-only programs, but I would guess that the Cocoa code accesses the process-level file descriptor 2 (stderr) under the covers. So you'll need to do some low-level fiddling around with the process' stderr. Here's an example:
import os
import sys

old_stderr = os.dup(sys.stderr.fileno())   # keep a copy of the real stderr
fd = os.open('path/to/mylog', os.O_CREAT | os.O_WRONLY)
os.dup2(fd, sys.stderr.fileno())
# Now, stderr output, including NSLog output, should go to 'path/to/mylog'
...
os.dup2(old_stderr, sys.stderr.fileno())
# stderr restored to its old state
Once you have fd, you can create a file-like object from it and pass it to StreamHandler, for example, as a means of merging the output from Python code and Cocoa code.
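For example, continuing the sketch above (the format string is just a placeholder):

import logging

log_file = os.fdopen(fd, 'w')   # wrap the same descriptor used for the redirect
handler = logging.StreamHandler(log_file)
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logging.getLogger().addHandler(handler)
# Python log records and the captured NSLog output now land in the same file.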

How can I detect that another copy of a Python script is already running

I have a script that uses GTK, and I need to know when another copy of the script is started. If one starts, the existing window should be extended.
Please tell me how I can detect this.
You could use a D-Bus service. Your script would start a new service if none is found running in the current session, and otherwise send a D-Bus message to the running instance (the message can carry "anything", including strings, lists and dicts).
The GTK-based library libunique (missing Python bindings?) uses this approach in its implementation of "unique" applications.
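A rough sketch of the D-Bus idea using dbus-python; the bus name and the ExtendWindow method below are made up for illustration:

import dbus

BUS_NAME = 'org.example.MyGtkApp'   # hypothetical well-known name

bus = dbus.SessionBus()
reply = bus.request_name(BUS_NAME, dbus.bus.NAME_FLAG_DO_NOT_QUEUE)

if reply == dbus.bus.REQUEST_NAME_REPLY_EXISTS:
    # Another copy already owns the name: ask it to extend its window and quit.
    running = bus.get_object(BUS_NAME, '/')
    running.ExtendWindow(dbus_interface=BUS_NAME)
    raise SystemExit(0)
# Otherwise this is the first copy: export an object implementing ExtendWindow
# (e.g. via dbus.service.Object) and start the GTK main loop as usual.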
You can use a PID file to determine if the application is already running (just search for "python daemon" on Google to find some working implementations).
If you detected that the program is already running, you can communicate with the running instance using named pipes.
The new copy could search for running copies, send a SIGUSR1 signal, and trigger a callback in your running process that then handles all the magic.
See the signal library for details and the list of things that can go wrong.
I've done this in several ways, depending on the scenario.
In one case my script had to listen on a TCP port, so I'd just check whether the port was available; if it was, this was a new copy. That was sufficient for me, but in certain cases, if the port is already in use, it might be because some other kind of application is listening on it. You can use OS calls to find out who is listening on the port, or try sending data and checking the response.
In another case I used a PID file. Just decide on a location and a filename, and every time your script starts, read that file to get a PID. If that PID is running, another copy is already there; otherwise create the file and write your process ID into it. This is pretty simple. If you are using Django then you can simply use Django's daemonizer: "from django.utils import daemonize". Otherwise you can use this script: http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
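A minimal sketch of the PID-file check (POSIX-only; the file location is a placeholder):

import errno
import os

PID_FILE = '/tmp/myscript.pid'   # hypothetical location

def another_copy_running():
    try:
        pid = int(open(PID_FILE).read().strip())
    except (IOError, ValueError):
        return False                     # no PID file yet, or garbage in it
    try:
        os.kill(pid, 0)                  # signal 0 only checks that the process exists
    except OSError as exc:
        return exc.errno == errno.EPERM  # EPERM: it exists but belongs to another user
    return True

if not another_copy_running():
    open(PID_FILE, 'w').write(str(os.getpid()))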

Have a Python CGI call a Perl CGI, passing original info (to limit searching private Mailman archives to logged-in users)

I need to have a Python CGI script do some stuff (a little bit of security checking), and then end up calling a Perl CGI script, passing anything it received (e.g., POST info) onto the Perl script.
For background, my reason for doing this is that I'm trying to integrate Swish searching with Mailman list archives.
Swish searching uses swish.cgi, a Perl script, but because these are private list archives I can't just allow people to call swish.cgi directly as recommended on this page: http://wpkg.org/Integrating_Mailman_with_a_Swish-e_search_engine#Mailman_configuration
I believe what I need to do is have the Mailman "private" cgi-bin file (written in Python) do its regular security checking (which calls a few Mailman/python modules) and THEN call on swish.cgi to do the search (after having verified that the user is on the mailing list).
Essentially, I believe the simplest solution would just be to protect access to the swish.cgi Perl script with a variant of the standard mailman cgi-bin/private Python script.
(I considered the idea that people could search with a non-protected swish.cgi, and people wouldn't be able to view the full results because those posts are already password-protected by default Mailman setup... but the problem is that even showing the Swish post excerpts in the search results could expose confidential information, so I must restrict access to even the search itself to just subscribers.)
If someone has a better idea of how to solve the overall problem without doing the Python-CGI-calls-Perl-CGI I'll be happy to consider that the "answer".
Just know that my goal is to make little (ideally no) changes to the standard Mailman installation. Copying the "private" cgi-bin script (whose source is mailman-2.1.12/Mailman/Cgi/private.py) and making changes to call swish.cgi is cool, but modifying the existing private cgi-bin script wouldn't really be cool.
Here's what I did to test the answer (using os.execv to replace the python script with the perl script, so that the perl script will inherit the python script's environment):
I created a pythontest script with:
import os

os.environ['FOO'] = 'BAR'        # prove the environment is inherited
mydir = os.path.dirname(os.environ.get('SCRIPT_FILENAME'))
childprog = mydir + '/perltest'
childargs = [childprog]          # execv needs a non-empty argv; argv[0] is the program name
os.execv(childprog, childargs)   # replaces this process with perltest
Then a perltest script with:
print "Content-type: text/html\n\n";
while (($key,$value) = each %ENV) {
print "<p>$key=$value</p>\n";
}
Then I called http://myserver.com/cgi-bin/pythontest and saw that the environment printout included the custom FOO variable, so the child perltest process had successfully inherited all the environment variables.
I'm just going to state the obvious here because I don't have any detailed knowledge about your specific environment.
If your python script is a genuine CGI and not a mod_python script or similar then it is just a regular process spawned to handle the one request. You can use os.execv to replace it with another process (e.g. the perl CGI) and the new process will inherit the current process' environment, stdin, stdout and stderr. This assumes that you don't need to read stdin for your security checks. It may also depend on whether your CGI is running in a restricted environment. execv is potentially dangerous and might be blocked in such an environment.
If you're running from a mod_python environment or if you need to peek at posted data (i.e. stdin) then the execv approach isn't available to you. You have two main alternatives.
You could run the perl CGI directly (e.g. look at the subprocess module), handing it a correct environment and feeding the correct data to its stdin. You can then spool the data returned on its stdout, raw (or cooked if needed), directly back to the web server.
Otherwise, you could make a local web request to run the CGI. This is likely to require a bit less knowledge about the server setup, but a bit more work in the python CGI to make and handle the HTTP request.
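A rough sketch of the subprocess option, written Python 2-style to match the Mailman era in question (the swish.cgi path is a placeholder):

import os
import subprocess
import sys

SWISH_CGI = '/usr/lib/cgi-bin/swish.cgi'   # hypothetical location

# ... do the Mailman-style security checks first ...

body = sys.stdin.read()                    # the original POST data, if any

proc = subprocess.Popen([SWISH_CGI],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        env=os.environ.copy())   # pass the CGI environment through
out, _ = proc.communicate(body)

sys.stdout.write(out)                      # swish.cgi emits its own CGI headers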
