Apache2.4 is finally running python script... but why?

It took me quite a while to figure out how to make Apache2.4 run my "Hello, world!" python script. I have finally figured out what sequence of commands I have to run in the command line for the script to work. Unfortunately, I still don't understand what is happening when I run those commands. I would like to know why they make my script work. I know it's all in the documentation, but so far I find it a bit hard to comprehend what's written there.
Here goes the list of commands I used.
sudo apt-get install apache2
sudo a2dismod mpm_event
sudo a2enmod mpm_prefork
sudo service apache2 restart
sudo a2enmod cgi
sudo service apache2 restart
Any comments on steps 2, 3 and 5 would be highly appreciated.
After that I create script.py in /usr/lib/cgi-bin:
#! /usr/bin/python
print "Content-type: text/html\n\n"
print "Hello, world!"
For some reason the first two lines of the script.py are absolutely necessary. There is no way the code is going to run without them.
And finally I run:
sudo chmod +x /usr/lib/cgi-bin/script.py #why do I need this? how come it is not executable by default?
sudo service apache2 restart
When I call http://localhost/cgi-bin/script.py I get my Hello, world!
I didn't even have to modify apache2.conf, serve-cgi-bin.conf or 000-default.conf
If there is a more obvious/better/correct way to run a python script using Apache24, I would really love to learn it.
P.S. Some people recommend adding AddHandler cgi-script .py .cgi to /etc/apache2/conf-enabled/serve-cgi-bin.conf if you encounter a problem when running a script on Apache. But for some reason it doesn't make any difference in my case. Why?
P.P.S. I use Ubuntu 14.04.

The event Multi-Processing Module (MPM) is designed to allow more
requests to be served simultaneously by passing off some processing
work to supporting threads, freeing up the main threads to work on new
requests.
http://httpd.apache.org/docs/2.2/mod/event.html
This Multi-Processing Module (MPM) implements a non-threaded,
pre-forking web server. Each server process may answer incoming
requests, and a parent process manages the size of the server pool. It
is appropriate for sites that need to avoid threading for
compatibility with non-thread-safe libraries. It is also the best MPM
for isolating each request, so that a problem with a single request
will not affect any other.
http://httpd.apache.org/docs/current/mod/prefork.html
5)
a2enmod is a script that enables the specified module within
the apache2 configuration.
http://manpages.ubuntu.com/manpages/lucid/man8/a2enmod.8.html
The name a2enmod stands for apache2 enable module.
For some reason the first two lines of the script.py are absolutely
necessary.
The first one tells apache how to execute your cgi script. After all, there are other server side languages, like php, perl, ruby, etc. How is apache supposed to know which server side language you are using?
The second line outputs an HTTP header, the simplest one you can use. Headers must be output before the body of the response; that is how the HTTP protocol works.
sudo chmod +x /usr/lib/cgi-bin/script.py
why do I need this? how come it is not executable by default?
A file cannot be executed unless someone entitled to do so (the file's owner or root) has explicitly set its execute permission. That is for security reasons.
If there is a more obvious/better/correct way to run a python script
using Apache24, I would really love to learn it.
Most of the commands you listed are to setup the apache configuration. You shouldn't have to run those commands every time you execute a cgi script. Once apache is configured, all you should have to do is start apache, then request a web page.
P.S. Some people recommend adding:
AddHandler cgi-script .py .cgi
to /etc/apache2/conf-enabled/serve-cgi-bin.conf if you encounter a
problem when running a script on Apache. But for some reason it
doesn't make any difference in my case. Why?
See here:
AddHandler handler-name extension [extension]
Files having the name extension will be served by the specified
handler-name. This mapping is added to any already in force,
overriding any mappings that already exist for the same extension. For
example, to activate CGI scripts with the file extension .cgi, you
might use:
AddHandler cgi-script .cgi
Once that has been put into your httpd.conf file, any file containing
the .cgi extension will be treated as a CGI program.
http://httpd.apache.org/docs/2.2/mod/mod_mime.html#addhandler
So it appears that when you add the AddHandler line it is overriding a configuration setting somewhere that does the same thing.
Response to comment:
ScriptInterpreterSource Directive
This directive is used to control how Apache httpd finds the
interpreter used to run CGI scripts. The default setting is Script.
This causes Apache httpd to use the interpreter pointed to by the
shebang line (first line, starting with #!) in the script
http://httpd.apache.org/docs/current/mod/core.html
On the same page, there is this directive:
CGIMapExtension Directive
This directive is used to control how Apache httpd finds the
interpreter used to run CGI scripts. For example, setting
CGIMapExtension sys:\foo.nlm .foo will cause all CGI script files with
a .foo extension to be passed to the FOO interpreter.

MPM stands for Multi-Processing Module. Basically, you replaced the event-based MPM with prefork. The MPM is used internally by Apache and usually affects nothing beyond performance (each MPM has different performance characteristics), but some things are not compatible with certain MPMs, and then you need to switch.
The cgi module is an additional module that provides the Common Gateway Interface; it is not included in Apache by default anymore.
The first line of the script is the shebang; it tells the Unix/Linux kernel which program to use as the interpreter, that is, "use /usr/bin/python to run this file, please". File extensions mean nothing in *nix with respect to executability.
The second line is the headers. The CGI specification says that the output shall be headers followed by an empty line, followed by the content. One header is mandatory: Content-Type. Here you are telling the web server and browser that what follows is a document of type text/html. '\n' stands for a newline. (Technically you should write
print "Content-type: text/html\n\n",
with a trailing comma, because print appends a newline of its own and otherwise you get one newline too many).
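For readers on Python 3, where print is a function, the same header, empty line, body layout could be sketched roughly as follows. The helper name build_cgi_response is made up for illustration; only the output layout comes from the CGI convention described above.

```python
#!/usr/bin/env python3
# Minimal CGI sketch for Python 3. build_cgi_response is a hypothetical
# helper name, not part of any library.
import sys


def build_cgi_response(body, content_type="text/html"):
    """Return the header, one empty line, then the body, as CGI expects."""
    return "Content-Type: {}\n\n{}".format(content_type, body)


if __name__ == "__main__":
    # sys.stdout.write adds no newline of its own, so the output contains
    # exactly one empty line between the header and the body.
    sys.stdout.write(build_cgi_response("Hello, world!\n"))
```

Saved in /usr/lib/cgi-bin and made executable, a script like this should behave like the original, assuming a python3 interpreter is on the PATH seen by Apache.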
Files in *nix don't have the +x execute bit on by default - this is a security feature; a conscious decision is required to make something executable.
As for the preferred method, since you control the server, use the Apache mod_wsgi with any web framework - Pyramid, Flask, Django, etc; WSGI applications are much more efficient than the CGI.
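To give an idea of what a WSGI application looks like without any framework, here is a minimal sketch. The callable is named application because that is the name mod_wsgi looks for by default; the response text is just an example.

```python
# Minimal WSGI application sketch (no framework involved).
def application(environ, start_response):
    """Answer every request with a plain-text greeting."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The status line and headers go through start_response;
    # the body is returned as an iterable of bytes.
    start_response("200 OK", headers)
    return [body]
```

Unlike a CGI script, this function stays loaded in the server process between requests, which is where most of the efficiency gain comes from. For a quick local try without Apache, the standard library's wsgiref.simple_server can serve the same callable.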

Related

Error with Python subprocess when running Flask app using nginx + WSGI

I have developed a Python web server using Flask, and some of the endpoints make use of the subprocess module to call different executables. On development, using the Flask debug server, everything works fine. However, when running the server along with nginx+WSGI (on the exact same machine), some subprocess calls fail.
For example, one of the tools I'm using is Microsoft's dotnet, which I installed from my user as sudo apt-get install -y aspnetcore-runtime-5.0 and is then called from Python with the subprocess module. When I run the server with python3 server.py, it works like a charm. However, when using nginx and WSGI, the subprocess call fails with an exception that says: /bin/sh: 1: dotnet: not found.
I suspect this is due to the command not being accessible to the user and group running the server. I have used this guide as a reference to deploy the app, and on the wsgi .ini file, I have set uid = javierd and gid = www-data, while on the systemd .service file I have User=javierd, Group=www-data.
I have tried to add the executables' paths to /etc/profile, but it didn't work, and I don't know any other way to fix it. I also find it very surprising that this happens to some executables but not to all, and that it happens to dotnet, for example, which is located at /usr/bin/dotnet and should therefore be accessible to every user. Any idea how to solve this problem? Furthermore, if somebody could explain to me why this is happening, I would really appreciate it.
Thanks a lot!

Ok, finally after having a big headache, I noticed the error, and it was really simple.
On the tutorial I linked, when creating the system service file, the following line was included: Environment="PATH=/home/myuser/myfolder/enviroment/bin".
Of course, as this was overriding the PATH, there was no way to find the commands. Once I noticed it, I just removed that line, restarted the service, and it was fixed.
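The effect can be reproduced from Python itself. The sketch below is only an illustration: find_executable and extend_path are hypothetical helper names, and the directories are placeholders. It shows that a lookup against a PATH containing a single unrelated directory finds nothing, and how an extra directory could be prepended instead of replacing the whole value.

```python
import os
import shutil


def find_executable(name, path_value):
    """Look for name only in the directories listed in path_value."""
    return shutil.which(name, path=path_value)


def extend_path(extra_dir, current=None):
    """Prepend extra_dir to a PATH value instead of replacing it."""
    if current is None:
        # Fall back to the PATH of the current process.
        current = os.environ.get("PATH", os.defpath)
    return extra_dir + os.pathsep + current
```

With a PATH made of one virtualenv directory only, find_executable("dotnet", ...) returns None, which matches the "dotnet: not found" message from /bin/sh. If the service file does need to set PATH, listing the system directories after the virtualenv one (the equivalent of extend_path) would presumably avoid the problem as well.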

mod_wsgi: Reload Code via Inotify - not every N seconds

Up to now I followed this advice to reload the code:
https://code.google.com/archive/p/modwsgi/wikis/ReloadingSourceCode.wiki
This has the drawback that code changes are detected only every N seconds. I could use N=0.1, but this results in useless disk I/O.
AFAIK the inotify callback of the linux kernel is available via python.
Is there a faster way to detect code changes and restart the wsgi handler?
We use daemon mode on linux.
Why code reload for mod_wsgi at all
There is interest in why I want this at all. Here is my setup:
Most people use "manage.py runserver" for development and some other WSGI deployment for production.
In my context we have automated the creation of new systems and prod and development systems are mostly identical.
One operating system (linux) can host N systems (virtual environments).
Developers can use runserver or mod_wsgi. Using runserver has the benefit that it's easy for debugging, mod_wsgi has the benefit that you don't need to start the server first.
mod_wsgi has the benefit, that you know the URL: https://dev-server/system-name/myurl/
With runserver you don't know the port. Use case: You want to link from an internal wiki to a dev-system ....
A dirty hack to get code reload for mod_wsgi, which we used in the past: maximum-requests=1 but this is slow.

Preliminaries.
Developers can use runserver or mod_wsgi. Using runserver has the
benefit that it's easy for debugging, mod_wsgi has the benefit that
you don't need to start the server first.
But you do, the server needs to be setup first and that takes a lot of effort. And the server needs to be started here as well though you can configure it to start automatically at boot.
If you are running on port 80 or 443, which is usually the case, the server can be started only by root. If it needs to be restarted, you will have to ask the superuser for help again. So ./manage.py runserver scores heavily here.
mod_wsgi has the benefit, that you know the URL:
https://dev-server/system-name/myurl/
Which is no different from the dev server. By default it starts on port 8000 so you can access it as http://dev-server:8000/system-name/myurl/. If you wanted to use SSL with the development server you can use a package such as django-sslserver or you can put nginx in front of django development server.
With runserver you don't know the port. Use case: You want to link from an internal wiki to a dev-system ....
With runserver, the port is well defined, as mentioned above. And you can make it listen on a different port, for example with:
./manage.py runserver 0.0.0.0:9090
Note that if you put development server behind apache (as a reverse proxy) or NGINX, restarting problems etc that I have mentioned above do not apply here.
So in short, for development work, whatever you do with mod_wsgi can be done with the Django development server (aka ./manage.py runserver).
Inotify
Here we are getting to the main topic at last. Assuming you have installed inotify-tools you could type this into your shell. You don't need to write a script.
while inotifywait -r -e modify .; do sudo kill -2 yourpid ; done
This will result in the code being reloaded when ...
... using daemon mode with a single process you can send a SIGINT
signal to the daemon process using the ‘kill’ command, or have the
application send the signal to itself when a specific URL is
triggered.
ref: http://modwsgi.readthedocs.io/en/develop/user-guides/frequently-asked-questions.html#application-reloading
alternatively
while inotifywait -r -e modify .; do touch wsgi.py ; done
when
... using daemon mode, with any number of processes, and the process
reload mechanism of mod_wsgi 2.0 has been enabled, then all you need
to do is touch the WSGI script file, thereby updating its modification
time, and the daemon processes will automatically shutdown and restart
the next time they receive a request.
In both situations we are using the -r flag to tell inotifywait to monitor subdirectories. That means each time you save a .css or .js file, Apache will reload. But without the -r flag, changes to Python code in subfolders will go undetected. To have the best of both worlds, filter out css, js, images etc. with the --exclude option.
What about when your IDE saves an auto backup file? or vim saves the .swp file? That too will cause a code reload. So you would have to exclude those file types too.
So in short, it's a lot of hard work to reproduce what the django development server does free of charge.
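If the filtering were done in Python rather than with --exclude, the decision could be sketched like this. It is only an illustration: the function names and the list of ignored suffixes are assumptions, and the inotify part itself (which needs inotifywait or a third-party library) is left out.

```python
from pathlib import Path

# Hypothetical list of file types whose changes should not reload the code:
# static assets plus editor swap and backup files.
IGNORED_SUFFIXES = {".css", ".js", ".png", ".jpg", ".gif", ".swp", ".bak"}


def should_trigger_reload(changed_path):
    """Return True if a change to this file should reload the application."""
    path = Path(changed_path)
    if path.name.endswith("~"):
        # Backup copies such as "views.py~" written by some editors.
        return False
    return path.suffix.lower() not in IGNORED_SUFFIXES


def touch_wsgi_script(wsgi_script):
    """Update the modification time of the WSGI script, like `touch`."""
    Path(wsgi_script).touch()
```

A watcher would call should_trigger_reload for each reported file and, when it returns True, call touch_wsgi_script("wsgi.py") to get the same effect as the second shell loop above.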

You can use inotify-hookable to run any command you want in response to an inotify event (here's my source link: http://terokarvinen.com/2016/when-files-change-take-action-inotify-hookable).
Once it is watching the files, you can just have it reload the code in Apache.
For your specific problem, it should be something like:
inotify-hookable --watch-directories sources/ --recursive --on-modify-command './code_reload.sh'
In the previous link, the command to execute was just a simple touch flask/init.wsgi
So the whole command (with the ignored files added) was:
inotify-hookable --watch-directories flask/ --recursive --ignore-paths='flask/init.wsgi' --on-modify-command 'touch flask/init.wsgi'
As stated here: Flask + mod_wsgi automatic reload on source code change, if you have enabled WSGIScriptReloading, you can just touch that file. It will cause the entire code to reload (not just the config file). But, if you prefer, you can set any other script to reload the code.
After googling a bit, it seems to be a pretty standard solution for that problem and I think that you can use it for your application.

Python cgi on apache server

I am new to Python CGI programming. I have installed the Apache 2.2 server on Linux Mint, and my HTML form in the /var/www folder is displayed properly. The form's action is a CGI script that I've put in the folder /usr/lib/cgi-bin. But on submit, it says "The requested URL /usr/lib/cgi-bin/file.cgi was not found on this server." Does anyone know the fix for this?

What is the name of your Python program? Is it /usr/lib/cgi-bin/file.cgi?
What are the rights of this file? Can it be read by apache? Can it be executed?
Is the first line a valid shebang such as #!/usr/bin/env python? Make sure that this file can be run from the command line (i.e. the shebang works).
Does apache receive request with that file? Look at apache logs, especially error.log and access.log (probably in /var/log/apache)
Make sure you have enabled ExecCGI for the /usr/lib/cgi-bin/ directory in the Apache configuration. For examples, see: http://opensuse.swerdna.org/suseapache.html. You can also use the ScriptAlias directive.
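The file-level checks from this list (existence, read and execute rights, shebang) can be automated with a short script. This is a sketch only: check_cgi_script is a hypothetical helper, and os.access tests the rights of the user running the check, not of the Apache user, so run it as that user for a meaningful answer.

```python
import os


def check_cgi_script(path):
    """Return a list of problems found with a CGI script (empty if none)."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    problems = []
    readable = os.access(path, os.R_OK)
    if not readable:
        problems.append("file is not readable")
    if not os.access(path, os.X_OK):
        problems.append("file is not executable (try chmod +x)")
    if readable:
        # The kernel picks the interpreter from the first line of the file.
        with open(path, "rb") as script:
            if not script.readline().startswith(b"#!"):
                problems.append("first line is not a shebang")
    return problems
```

For example, check_cgi_script("/usr/lib/cgi-bin/file.cgi") returns an empty list when none of these problems apply. It does not look at the Apache configuration, so ExecCGI or ScriptAlias still has to be checked by hand.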

running subprocess.Popen under apache+mod_wsgi is always returning an error with a returncode of -6

I'm hoping someone's seen this -
I'm running django-compressor, taking advantage of the lessc setup to render/compress the less into CSS on the fly. It works perfectly when invoked from the development server, but when run underneath apache+mod_wsgi it is consistently returning an error.
To debug this, I have run the exact command that the filter is invoking as the www-data user (which is defined as the wsgi user in the WSGIDaemonProcess directive) and verified that it works correctly, including permissions to read and write the files that it's manipulating.
I have also hacked on the django-compressor code in compressor/filters/base.py on that system, and it seems that ANY command attempting to get invoked is getting a returncode of -6 after the proc.communicate() invocation.
I'm hoping someone's seen this before - or that it rings some bell. It works fine on this machine outside of the apache+mod_wsgi process (i.e. running the process as a dev server) as well. I'm just not clear on what might be blocking the subprocess.Popen() invocations.

Are you using Python 2.7.2 by chance?
That version of Python introduced a bug which causes fork() in sub interpreters to fail:
http://bugs.python.org/issue13156
You will have to force WSGI application to run in main Python interpreter of the process by setting:
WSGIApplicationGroup %{GLOBAL}
If you are running multiple Django applications, make sure that only the affected one has this configuration directive applied; otherwise all Django applications would run in one interpreter, which isn't possible because of how Django configuration works.
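As a side note on the returncode of -6 from the question: on POSIX systems, subprocess reports a negative returncode when the child was terminated by a signal, and signal 6 is SIGABRT on Linux. A small helper for decoding such values might look like this (describe_returncode is a hypothetical name, not part of subprocess):

```python
import signal


def describe_returncode(returncode):
    """Turn a Popen.returncode value into a readable description."""
    if returncode is None:
        return "still running"
    if returncode < 0:
        # A negative value -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal {}".format(-returncode)
        return "killed by " + name
    return "exited with status {}".format(returncode)
```

Logging describe_returncode(proc.returncode) after proc.communicate() makes it clearer that the child process aborted rather than exited with an ordinary error status.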

how to run .py file in browser using python webserver

I am running a Python web server using this simple script:
from http.server import SimpleHTTPRequestHandler as RH
from socketserver import TCPServer
ServerName='localhost'
OnPort=8000
print ("Server is running at Port 8000")
TCPServer((ServerName, OnPort), RH).serve_forever()
It runs fine and serves my index.html file, but it does not run a .py file in the browser when I type:
http://localhost:8000/myfile.py
It just shows my Python code exactly as I wrote it in the file; it does not execute the code. Please help me run my Python (.py) file in the browser using this web server only. I don't want to use any framework or another web server.
Is there any way to set up a virtual host in this Python server, like in Apache?
If possible, please suggest how to do this and whether any configuration file needs to be configured.
Thanks...

The problem is that SimpleHTTPRequestHandler only serves files out of a directory, it does not execute them. You must override the do_GET method if you want it to execute code.

You might want to check out CGIHTTPRequestHandler instead. I very briefly played with it on a Linux-based system, and the CGI criteria were for a file to be executable and have the correct shebang. I'm not sure if this would work on Windows though (if that is even a relevant concern).
With your example code you would only need to change the import of RH to
from http.server import CGIHTTPRequestHandler as RH
Alternatively the 3rd party library Twisted has a concept of .rpy files that are mostly plain Python logic. http://twistedmatrix.com/trac/wiki/TwistedWeb
Note:
Just verified that the CGIHTTPRequestHandler works. The caveats are that all Python files must be in a cgi-bin subdir, they must have a valid shebang, must be executable, and must provide valid CGI output.
Personally, having written C++ CGI scripts in the 90's, the CGI route seems like the path to madness... so check out Twisted, CherryPy, or Django (those three mostly cover the Python web spectrum of possibilities).
