Running multiple sites from a single Python web framework [duplicate] - python

This question already has answers here:
Multiple sites on Django
(2 answers)
Closed 1 year ago.
I know you can do redirection based on the domain or path to rewrite the URI to point at a site-specific location and I've also seen some brutish if and elif statements for every site as shown in the following code, which I would like to avoid.
if site == 'site1':
    ...
elif site == 'site2':
    ...
What are some good and clever ways of running multiple sites from a single, common Python web framework (i.e., Pylons, TurboGears, etc)?

Django has this built in. See the sites framework.
As a general technique, include a 'host' column in your database schema attached to the data you want to be host-specific, then include the Host HTTP header in the query when you are retrieving data.
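As an illustrative sketch of that technique (the schema, table, and function names here are hypothetical, not from any particular framework): keep one shared table with a 'host' column, and filter by the value of the Host header on each request:

```python
import sqlite3

def get_titles_for_host(conn, host):
    """Return only the rows belonging to the requesting site."""
    cur = conn.execute("SELECT title FROM articles WHERE host = ?", (host,))
    return [row[0] for row in cur.fetchall()]

# Demo schema: one shared table, one 'host' column marking site ownership.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (host TEXT, title TEXT)")
conn.executemany("INSERT INTO articles VALUES (?, ?)",
                 [("site1.com", "Hello from site1"),
                  ("site2.com", "Hello from site2")])

# In a real app, 'host' would come from environ["HTTP_HOST"].
print(get_titles_for_host(conn, "site1.com"))  # ['Hello from site1']
```

Each site then sees only its own rows, even though all sites share one application and one database.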

Using Django on Apache with mod_python, I host multiple (unrelated) Django sites simply with the following Apache config:
<VirtualHost 1.2.3.4>
DocumentRoot /www/site1
ServerName site1.com
<Location />
SetHandler python-program
SetEnv DJANGO_SETTINGS_MODULE site1.settings
PythonPath "['/www'] + sys.path"
PythonDebug On
PythonInterpreter site1
</Location>
</VirtualHost>
<VirtualHost 1.2.3.4>
DocumentRoot /www/site2
ServerName site2.com
<Location />
SetHandler python-program
SetEnv DJANGO_SETTINGS_MODULE site2.settings
PythonPath "['/www'] + sys.path"
PythonDebug On
PythonInterpreter site2
</Location>
</VirtualHost>
No need for multiple apache instances or proxy servers. Using a different PythonInterpreter directive for each site (the name you enter is arbitrary) keeps the namespaces separate.

I use CherryPy as my web server (which comes bundled with Turbogears), and I simply run multiple instances of the CherryPy web server on different ports bound to localhost. Then I configure Apache with mod_proxy and mod_rewrite to transparently forward requests to the proper port based on the HTTP request.
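A name-based virtual-host sketch of that setup (the hostnames and backend ports here are made up; the CherryPy instances are assumed to be listening on localhost:8081 and localhost:8082):

```apache
<VirtualHost *:80>
    ServerName site1.com
    ProxyPreserveHost On
    ProxyPass        / http://127.0.0.1:8081/
    ProxyPassReverse / http://127.0.0.1:8081/
</VirtualHost>

<VirtualHost *:80>
    ServerName site2.com
    ProxyPreserveHost On
    ProxyPass        / http://127.0.0.1:8082/
    ProxyPassReverse / http://127.0.0.1:8082/
</VirtualHost>
```

Apache picks the VirtualHost by the Host header, so each domain is forwarded to its own CherryPy instance.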

Using multiple server instances on local ports is a good idea, but you don't need a full-featured web server to redirect HTTP requests.
I would use pound as a reverse proxy to do the job. It is small, fast, simple and does exactly what we need here.
WHAT POUND IS:
a reverse-proxy: it passes requests from client browsers to one or more back-end servers.
a load balancer: it will distribute the requests from the client browsers among several back-end servers, while keeping session information.
an SSL wrapper: Pound will decrypt HTTPS requests from client browsers and pass them as plain HTTP to the back-end servers.
an HTTP/HTTPS sanitizer: Pound will verify requests for correctness and accept only well-formed ones.
a fail-over server: should a back-end server fail, Pound will take note of the fact and stop passing requests to it until it recovers.
a request redirector: requests may be distributed among servers according to the requested URL.
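A sketch of what a host-based pound.cfg might look like (the ports are made up, and the exact directive spellings should be checked against the pound man page for your version):

```text
ListenHTTP
    Address 0.0.0.0
    Port    80

    # Route by Host header: each site goes to its own local backend.
    Service
        HeadRequire "Host: .*site1\.com.*"
        BackEnd
            Address 127.0.0.1
            Port    8081
        End
    End

    Service
        HeadRequire "Host: .*site2\.com.*"
        BackEnd
            Address 127.0.0.1
            Port    8082
        End
    End
End
```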

Related

Weird `/` behavior when Serving Flask with Apache2 + Gunicorn

I'm trying to build multiple endpoints and subendpoints within my application, part of it as a learning exercise, and part of it is that I have 2 domains.
For simplicity I'm going to refer to them as domain1 and domain2.
My Flask listening endpoints are on /api1 and /api2 for domains 1 & 2 respectively. Gunicorn is bound to listen on a unix socket at sock/domain1.sock and sock/domain2.sock. So far everything is working this way.
My Apache2 proxies the endpoints into the proper socket as follows:
for domain1 I have:
<Location /api>
ProxyPass unix:/var/www/socks/domain1.sock|http://127.0.0.1/api1
ProxyPassReverse unix:/var/www/socks/domain1.sock|http://127.0.0.1/api1
</Location>
for domain2 I have:
<Location /api>
ProxyPass unix:/var/www/socks/domain2.sock|http://127.0.0.1/api2
ProxyPassReverse unix:/var/www/socks/domain2.sock|http://127.0.0.1/api2
</Location>
I know that I don't need to have 2 sockets, but I'm doing so just for testing.
Now when I open domain1.com/api things work perfectly, and so does domain2.com/api.
But when I open domain1.com/api/ (with a slash at the end) or domain2.com/api/, I get a Site Not Found error. This is understandable, since my Flask routes are defined without a trailing slash. The fix is to add the trailing / to my Flask endpoints. But when I do that, the weird behavior occurs.
New Flask listening Endpoints are /api1/ and /api2/ (with trailing slash).
Now when I open domain.com/api/ it works as intended. But when I open domain.com/api (without the slash) it redirects me to either domain.comapi or 127.0.0.1/api, both of which are wrong. I tried adding a trailing slash in my Apache config and tried multiple Flask approaches, but they all show the same weird behavior and I can't understand why. Personally I don't mind using the endpoint without the slash; I just want to understand why this is happening. I also googled a lot but found nothing related to my query.
Reproducible Behavior:
I'm unable to link the 2nd domain as it is a protected IP for my company, so I created multiple endpoints that you can click to simulate the behavior.
https://thethiny.xyz/api1 -> sock|http://127.0.0.1/api1 -> internal /api1
https://thethiny.xyz/api2 -> sock|http://127.0.0.1/api2 -> internal /api2/
https://thethiny.xyz/api3 -> sock|http://127.0.0.1/api1/ -> internal /api1
https://thethiny.xyz/api4 -> sock|http://127.0.0.1/api2/ -> internal /api2
Working:
https://thethiny.xyz/api1
https://thethiny.xyz/api2/
https://thethiny.xyz/api4
Not Found:
https://thethiny.xyz/api1/
https://thethiny.xyz/api3
https://thethiny.xyz/api3/
Weird Redirect:
https://thethiny.xyz/api2
https://thethiny.xyz/api4/
Edit: I understand the problem now and have come up with some solutions in the answer below. I'm not satisfied with them, but I'm taking this as a limitation of mapping endpoints to different underlying endpoints. For more information, read about Reverse Proxy Pass and Redirects and Rewriting the Location Header in httpd.
I now understand the problem. My Apache proxy hands the request to Flask on the endpoint specified, 127.0.0.1/api2, so when Flask issues a redirect it redirects to 127.0.0.1/api2/, since Flask has no information about the original URL. Using ProxyPreserveHost solves this only when the endpoint resources match, as in mapping /api2/ to /api2 but not /api4/ to /api2/: on the redirect, Flask receives a request for /api2 -> /api2/ and returns that, knowing nothing about /api4. Unfortunately, I don't think there's an actual solution to this in the Apache2/Flask configuration other than handling the routes manually, exactly as you want them: don't let Flask redirect automatically (it won't know how); instead either redirect manually (an external redirect) to the correct endpoint, or handle each route separately (/api and /api/stuff but not /api/).
Example:
app.add_url_rule("/api2", endpoint="api2_redirect", redirect_to="/api2/")  # redirect-only rule; no view needed
app.add_url_rule("/api2/", view_func=actual_function)  # the real handler
And add ProxyPreserveHost On to your Apache2 config, or use the built-in ProxyFix middleware if you don't want to modify your virtual hosts:
from werkzeug.middleware.proxy_fix import ProxyFix
app.wsgi_app = ProxyFix(app.wsgi_app, x_proto=1, x_host=1)
What happens now is that 127.0.0.1 gets translated to yourdomain.tld when delivered to your Flask app. So when you redirect back using redirect_to, you redirect to your domain externally, no longer relatively. So in the case above, /api2 redirects to myDomain.tld/api2/, then /api2/ is called, which is functional.
You can also skip the preserve host and manually put your domain name in the redirect, like so:
app.add_url_rule("/api2", view_func=StubFunction(), redirect_to="https://yourDomain.tld/api2/")
But I don't like this approach in case you change your domain for some reason.
tl;dr: don't put a trailing slash in your ProxyPass applications.
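A related workaround not mentioned in the answer above: Flask can serve both the slash and no-slash forms from a single rule with strict_slashes=False, avoiding the automatic redirect entirely (the route and view name here are hypothetical):

```python
from flask import Flask

app = Flask(__name__)

# strict_slashes=False makes /api2 and /api2/ hit the same view,
# so Flask never issues the redirect that confuses the proxy.
@app.route("/api2", strict_slashes=False)
def api2():
    return "ok"

client = app.test_client()
print(client.get("/api2").status_code, client.get("/api2/").status_code)
```

Since no redirect is ever generated, the proxy never has a Location header to mangle.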

HTTP response 403. Is the problem with my Apache config or my Python encoding of the URL?

When I access
http://my_site.com/api/my_project/submitSearch.php?skills=C+OR+%28C%2B%2B+AND+UML%29
I get an HTTP response of 403.
The point being that I am encoding skills=C OR (C++ AND UML) in Python using urllib.parse.quote_plus().
If I use skills=(C++ AND UML), then there is no problem.
http://my_site.com/api/my_project/submitSearch.php?skills=%28C%2B%2B+AND+UML%29
I can only assume that the URL is triggering some Apache config rule. I asked my ISP and their solution was to allow all access from my current IP address. BUT, I want to allow everyone to access this URL, so how can I configure my Apache to allow this?
OR, am I wrongly encoding my URL in Python? Strangely, when I use encodeURIComponent() in JavaScript, the server does not reject the request.
So, the JS/Python encodings are
Python: http://localhost/api/enigma/submitSearch.php?skills=C+OR+%28C%2B%2B+AND+UML%29
JS:        http://localhost/api/enigma/submitSearch.php?skills=C%20OR%20(C%2B%2B%20AND%20UML)
Also, the problem is only at my ISP, not on localhost
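The difference between the two encodings can be reproduced directly. Both encode + as %2B; they differ only in how spaces and parentheses are handled:

```python
from urllib.parse import quote, quote_plus

s = "C OR (C++ AND UML)"

# quote_plus: spaces become '+', parentheses are percent-escaped
print(quote_plus(s))        # C+OR+%28C%2B%2B+AND+UML%29

# like JavaScript's encodeURIComponent: %20 for spaces, bare parentheses
print(quote(s, safe="()"))  # C%20OR%20(C%2B%2B%20AND%20UML)
```

If the server's rules only object to one of these forms, switching the Python side to the quote() variant reproduces what the JavaScript client sends.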
You could try adding an .htaccess file with the following content to the relevant directory of your website (or even its root directory). This should grant access to anyone who tries to access the dir.
Allow from all
Satisfy any
You can also place these directives in the server config, wrapped like so:
<Directory "/path/to/mysubdirectory">
Allow from all
Satisfy any
AllowOverride All
</Directory>
to make them apply to a given dir.
I'm not familiar with Apache, so I can't help you there.
To get the same output as JavaScript's encodeURIComponent, try urllib.parse.urlencode with urllib.parse.quote (instead of the default quote_plus):
import urllib.parse

query = {"skills": "C OR (C++ AND UML)"}
query_string = urllib.parse.urlencode(query, safe="-_.!~*'()", quote_via=urllib.parse.quote)
print(query_string)
skills=C%20OR%20(C%2B%2B%20AND%20UML)

Multi-process Flask Application on Apache

I am trying to create a simple web application with Python3/Flask and serve it on Apache. I could not figure out how to make my application respond to multiple requests.
This is my wsgi file:
import sys
import os
sys.path.insert(0, '/var/www/html/FlaskDeploy')
from apps import app as application
This code excerpt from httpd.conf file:
<VirtualHost *:80>
DocumentRoot /var/www/html/FlaskDeploy
WSGIScriptAlias / /var/www/html/FlaskDeploy/app.wsgi
WSGIDaemonProcess apps threads=1 python-path=/var/www/html/FlaskDeploy/env/bin:/var/www/html/FlaskDeploy/env/lib/python3.6/site-packages
<Directory /var/www/html/FlaskDeploy>
WSGIScriptReloading On
WSGIProcessGroup apps
WSGIApplicationGroup %{GLOBAL}
Order deny,allow
Allow from all
</Directory>
</VirtualHost>
Everything works fine but the application runs the requests one by one. For example, assume that each user performs a heavy database operation that takes 3 minutes. In this case, when 3 users from different locations open the application at the same time, the last one has to wait for 9 minutes (including the others' operations to be completed).
Basically I want to create a web application that is able to handle multiple requests.
I come from the NodeJS world and have never encountered this problem there. NodeJS runs on a single thread but can handle multiple requests.
It is only capable of handling one request at a time because that is what you told mod_wsgi to do by using:
threads=1
Don't set that option and it will instead default to 15 threads in the daemon process group, which is how many requests it can handle concurrently.
If your requests are I/O bound that should be fine to begin with and you can tune things later. If your requests are more CPU bound than I/O bound, start to introduce additional processes as well and distribute requests across them.
processes=3 threads=5
Even if heavily I/O bound, do not increase the number of threads per process too far; it is better to spread them across processes, as Python doesn't work well with a high number of threads per process.
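Applied to the question's configuration, the tuned daemon directive might look like the following (the path is taken from the question; the exact process and thread counts are workload-dependent, not prescriptive):

```apache
WSGIDaemonProcess apps processes=3 threads=5 python-path=/var/www/html/FlaskDeploy/env/bin:/var/www/html/FlaskDeploy/env/lib/python3.6/site-packages
```

This gives 15 worker threads spread over 3 processes, so CPU-bound requests can run on separate processes instead of contending within one interpreter.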
For more information read the documentation:
http://modwsgi.readthedocs.io/en/develop/user-guides/processes-and-threading.html

Python library for a SOAP server that handles multiple requests simultaneously?

I'm looking for a python library for easily creating a server which exposes web services (SOAP), and can process multiple requests simultaneously.
I've tried using ZSI and rpclib, but with no success.
Update:
Thanks for your answers. Both ZSI and rpclib (the successor of soaplib) implement their own HTTP server. How do I integrate ZSI/rpclib with the libraries you mentioned?
Update2:
After some tweaking, I managed to install and run this on Linux, and it seems to work well.
Then I installed it on Windows, after a lot of ugly tweaking, and stumbled upon the fact that WSGIDaemonProcess isn't supported on Windows (as mentioned in the mod_wsgi docs). I tried to run it anyway, and it does seem to handle each request asynchronously, but I'm not sure it will work well under pressure.
Thanks anyway...
Hello World example of rpclib
Please check this example from rpclib:
# File /home/myhome/test.wsgi
import logging

from rpclib.application import Application
from rpclib.decorator import srpc
from rpclib.interface.wsdl import Wsdl11
from rpclib.protocol.soap import Soap11
from rpclib.service import ServiceBase
from rpclib.model.complex import Iterable
from rpclib.model.primitive import Integer
from rpclib.model.primitive import String
from rpclib.server.wsgi import WsgiApplication

class HelloWorldService(ServiceBase):
    @srpc(String, Integer, _returns=Iterable(String))
    def say_hello(name, times):
        '''
        Docstrings for service methods appear as documentation in the wsdl.
        <b>what fun</b>
        @param name the name to say hello to
        @param times the number of times to say hello
        @return the completed array
        '''
        for i in xrange(times):
            yield 'Hello, %s' % name

application = WsgiApplication(Application([HelloWorldService],
    'rpclib.examples.hello.soap',
    interface=Wsdl11(), in_protocol=Soap11(), out_protocol=Soap11()))
Also change your Apache config to:
WSGIDaemonProcess example processes=5 threads=5
WSGIProcessGroup example
WSGIScriptAlias / /home/myhome/test.wsgi
<Directory /home/myhome/>
Order deny,allow
Allow from all
</Directory>
You can change the processes and threads as your requirements dictate.
Excuse me, maybe I didn't understand you right.
I think you want your server to process HTTP requests in parallel, but then you don't need to think about your code/library: parallelizing is done by Apache httpd and the mod_wsgi/mod_python module.
Just set up httpd.conf with, for example, 'MaxClients 100' and 'WSGIDaemonProcess webservice processes=1 threads=100'.
You can use soaplib to develop your SOAP service. To expose that service to others you can use Apache and the mod_wsgi module. To make it multithreaded or multiprocess, set the corresponding parameters in mod_wsgi.

Processing chunked encoded HTTP POST requests in python (or generic CGI under apache)

I have a J2ME client that POSTs some chunked-encoded data to a webserver. I'd like to process the data in Python. The script is run as a CGI one, but apparently Apache will refuse a chunked-encoded POST request to a CGI script. As far as I could see, mod_python, WSGI and FastCGI are no-go too.
I'd like to know if there is a way to have a Python script process this kind of input. I'm open to any suggestion (e.g. a configuration setting in apache2 that would assemble the chunks, a standalone Python server that would do the same, etc.). I did quite a bit of googling and didn't find anything usable, which is quite strange.
I know that resorting to Java on the server side would be a solution, but I just can't imagine that this can't be solved with Apache + Python.
I had the exact same problem a year ago with a J2ME client talking to a Python/Ruby backend. The only solution I found which doesn't require application or infrastructure level changes was to use a relatively unknown feature of mod_proxy.
Mod_proxy has the ability to buffer incoming (chunked) requests, and then rewrite them as a single request with a Content-Length header before passing them on to a proxy backend. The neat trick is that you can create a tiny proxy configuration which passes the request back to the same Apache server. i.e. Take an incoming chunked request on port 80, "dechunk" it, and then pass it on to your non-HTTP 1.1 compliant server on port 81.
I used this configuration in production for a little over a year with no problems. It looks a little something like this:
ProxyRequests Off
<Proxy http://example.com:81>
Order deny,allow
Allow from all
</Proxy>
<VirtualHost *:80>
SetEnv proxy-sendcl 1
ProxyPass / http://example.com:81/
ProxyPassReverse / http://example.com:81/
ProxyPreserveHost On
ProxyVia Full
<Directory proxy:*>
Order deny,allow
Allow from all
</Directory>
</VirtualHost>
Listen 81
<VirtualHost *:81>
ServerName example.com
# Your Python application configuration goes here
</VirtualHost>
I've also got a full writeup of the problem and my solution detailed on my blog.
I'd say use the Twisted framework for building your HTTP listener.
Twisted supports chunked encoding.
http://python.net/crew/mwh/apidocs/twisted.web.http._ChunkedTransferEncoding.html
Hope this helps.
Apache 2.2 mod_cgi works fine for me, Apache transparently unchunks the request as it is passed to the CGI application.
WSGI currently disallows chunked requests, and mod_wsgi does indeed block them with a 411 response. It's on the drawing board for WSGI 2.0. But congratulations on finding something that does chunk requests, I've never seen one before!
You can't do what you want with mod_python. You can do it with mod_wsgi if you are using version 3.0. You do however have to step outside of the WSGI 1.0 specification as WSGI effectively prohibits chunked request content.
Search for WSGIChunkedRequest in http://code.google.com/p/modwsgi/wiki/ChangesInVersion0300 for what is required.
Maybe it is a configuration issue? Django can be fronted with Apache via mod_python, WSGI or FastCGI, and it can accept file uploads.
