Bind arbitrary Python objects to CherryPy sessions

I'm using CherryPy to build a web-based frontend for SymPy that uses an asynchronous process library on the server side, allowing multiple requests to be processed at once without waiting for each one to complete. To let the frontend behave as expected, I am using one process for the entirety of each session. The client-side JavaScript sends the session id from the cookie to the server when the user submits a request. The server side currently uses a pair of lists, storing instances of a controller class in one and the corresponding session ids in the other, and it creates a new interpreter proxy and sends the input whenever a nonexistent session id is submitted. The only problem with this is that the proxy objects are not deleted when their corresponding sessions expire. Also, I can't find a way to retrieve the session id for which the current request is being served.
My questions about all this are: is there any way to "connect" an arbitrary object to a CherryPy session so that it gets deleted upon session expiration? Is there something I am overlooking here that would greatly simplify things? And does CherryPy's multi-threading remove the problem of synchronously reading the stdout filehandle of the child process?

You can create your own session type, derived from CherryPy's base session. Use its clean_up method to do your cleanup.
Look at cherrypy/lib/sessions.py for details and sample session implementations.
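For illustration, a minimal sketch of that idea, assuming an older CherryPy where tools.sessions.storage_type is resolved by name against classes in cherrypy.lib.sessions (newer releases take a storage_class object directly). The 'interpreter' key and the proxy's terminate() method are hypothetical stand-ins for however you store and dispose of your interpreter proxies:

```python
import cherrypy
from cherrypy.lib import sessions

class InterpreterSession(sessions.RamSession):
    """RamSession variant that disposes of the per-session interpreter
    proxy before expired sessions are discarded."""

    def clean_up(self):
        now = self.now()
        # Iterate over a copy; the base class deletes the entries itself.
        for sid, (data, expiration_time) in self.cache.copy().items():
            if expiration_time <= now:
                proxy = data.get('interpreter')   # hypothetical key
                if proxy is not None:
                    proxy.terminate()             # hypothetical cleanup hook
        super(InterpreterSession, self).clean_up()

# storage_type 'interpreter' is resolved to sessions.InterpreterSession,
# so the class must be registered in the sessions module:
sessions.InterpreterSession = InterpreterSession

cherrypy.config.update({
    'tools.sessions.on': True,
    'tools.sessions.storage_type': 'interpreter',
})
```

As for retrieving the session id of the current request: within a request handler it is available as cherrypy.session.id, so the client-side JavaScript shouldn't need to send it separately.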

Related

Flask/Python: from flask import request

I read the Flask docs, which say that whenever you need to access the GET variables in the URL, you can just import the request object into your current Python file.
My question is: if two users are hitting the same Flask app with the same URL and GET variables, how does Flask differentiate the request objects? Can someone tell me what is under the hood?
From the docs:
In addition to the request object there is also a second object called session which allows you to store information specific to a user from one request to the next. This is implemented on top of cookies for you and signs the cookies cryptographically. What this means is that the user could look at the contents of your cookie but not modify it, unless they know the secret key used for signing.
This means every user is associated with a Flask session object, which distinguishes them from each other.
Just wanted to highlight one more fact about the request object.
As per the documentation, it is a kind of proxy to objects that are local to a specific context.
Imagine the context being the handling thread. A request comes in and the web server decides to spawn a new thread (or something else; the underlying object is capable of dealing with concurrency systems other than threads). When Flask starts its internal request handling, it figures out that the current thread is the active context and binds the current application and the WSGI environment to that context (thread). It does that in an intelligent way so that one application can invoke another application without breaking.
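A small demonstration of what that proxying buys you; the /greet route and the name parameter are made up for illustration:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/greet')
def greet():
    # `request` looks like a module-level global, but it is a proxy that
    # resolves to the request bound to the *current* context, so two users
    # hitting this view concurrently each see only their own arguments.
    name = request.args.get('name', 'world')
    return 'Hello, %s!' % name

# Outside a real request the proxy is unbound; for tests Flask can push
# a request context manually:
with app.test_request_context('/greet?name=Alice'):
    assert request.args['name'] == 'Alice'
```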

What are good ways to create an HTTP session with simultaneous requests?

I have a problem and can't solve it. Maybe I'm making it too hard or complex, or I'm just going in the wrong direction and thinking of things that don't make sense. Below is a description of what happens (for example, multiple tabs opened in a browser, or a page that requests some other pages at the same time).
I have a situation where 3 requests are received by the web application simultaneously and a new user session has to be created. This session is used to store notifications, the XSRF token, and login information once the user logs in. The application uses threads to handle requests (CherryPy under Bottle.py).
The 3 threads (or processes, in the case of multiple application instances) start handling the 3 requests. Each checks the cookie, finds no session, and creates a new unique token that is stored in a cookie and in Redis. This all happens at the same time, and none of them knows whether a session has already been created by another thread, because all 3 tokens are unique.
These unused sessions will expire eventually, but it's not neat. It means that every time a client simultaneously makes N requests and a new session needs to be created, N-1 sessions are useless.
If there were a property that could be used to identify a client, like an IP address, it would be a lot easier, but an IP address is not safe to use in this case. Such a property could be used to atomically store a session in Redis, and the other requests would simply pick up that session.
If this is through a browser and is using cookies, then this shouldn't be an issue at all. From what I can tell, the cookie will hold the last session value that it was set to. If the client you are using does not use cookies, then of course it will open a new session for each connection.
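That said, if some client-identifying property is available after all, the race described in the question can be settled atomically with Redis's SET ... NX, so that only the first of the N simultaneous requests creates a session. A minimal sketch using redis-py, where client_key and the session: key prefix are assumptions for illustration:

```python
import uuid
import redis

r = redis.Redis()

def get_or_create_session(client_key, ttl=3600):
    """Atomically claim a session for `client_key` (a hypothetical
    client-identifying property). Only the request that wins the race
    creates a token; the other N-1 requests read the winner's token."""
    token = uuid.uuid4().hex
    # SET ... NX writes only if the key does not exist yet and returns
    # True for exactly one of the concurrent callers.
    if r.set('session:%s' % client_key, token, nx=True, ex=ttl):
        return token
    return r.get('session:%s' % client_key).decode()
```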

python reverse proxy spawning via cgi

I need to write a cgi page which will act like a reverse proxy between the user and another page (an mbean). The issue is that each mbean uses a different port, and I do not know ahead of time which port the user will want to hit.
Therefore what I need to do is the following:
A) Give the user a page which will allow him to choose which application he wants to hit
B) spawn a reverse proxy based on the information above (which gives me the port, server, etc.)
C) the user connects to the remote mbean page via the reverse proxy and therefore never "leaves" the original page.
The reason for C is that the user does not have direct access to any of the internal apps and only has access to the initial port 80.
I looked at twisted and it appears to me like it can do the job. What I don't know is how to spawn a twisted process from within cgi so that it can establish the connection and keep further connections within the reverse proxy framework.
BTW I am not married to twisted; if there is another tool that would do the job better, I am all ears. I can't use things like mod_proxy (for instance) since the wide range of ports would make configuration rather silly (around 1000 different proxy settings).
You don't need to spawn another process; that would complicate things a lot. Here's how I would do it, based on something similar in my current project:
Create a WSGI application, which can live behind a web server.
Create a request handler (or "view") that is accessible from any URL mapping as long as the user doesn't have a session ID cookie.
In the request handler, the user can choose the target application and with it, the hostname, port number, etc. This request handler creates a connection to the target application, for example using httplib, and assigns a session ID to it. It sets the session ID cookie and redirects the user back to the same page.
Now when your user hits the application, you can use the already-open HTTP connection to relay the query. Note that WSGI supports passing back an open file-like object as the response body, including those provided by httplib, for increased performance.
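A rough sketch of those steps as one WSGI callable. It uses Python 3 names (http.client is the renamed httplib), and the sid cookie name, the internal-app host, and port 8443 are invented for illustration; real code would also forward the request body and strip hop-by-hop headers:

```python
import uuid
from http import client, cookies

# session id -> open connection to the backend the user picked
backends = {}

def application(environ, start_response):
    jar = cookies.SimpleCookie(environ.get('HTTP_COOKIE', ''))
    sid = jar['sid'].value if 'sid' in jar else None

    if sid not in backends:
        # No session yet: a real app would render the chooser page here;
        # this sketch hard-codes a hypothetical target instead.
        sid = uuid.uuid4().hex
        backends[sid] = client.HTTPConnection('internal-app', 8443)
        start_response('302 Found', [
            ('Location', environ.get('PATH_INFO', '/')),
            ('Set-Cookie', 'sid=%s; HttpOnly' % sid),
        ])
        return [b'']

    # Relay the request over the already-open connection.
    conn = backends[sid]
    conn.request(environ['REQUEST_METHOD'], environ.get('PATH_INFO', '/'))
    resp = conn.getresponse()
    start_response('%s %s' % (resp.status, resp.reason), resp.getheaders())
    return [resp.read()]
```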

Python Webserver: How to serve requests asynchronously

I need to create a Python middleware that will do the following:
a) Accept HTTP GET/POST requests from multiple clients.
b) Modify and dispatch these requests to a backend remote application (via socket communication). I do not have any control over this remote application.
c) Receive processed results from the backend application and return these results back to the requesting clients.
Now the clients are expecting a synchronous request/response scenario. But the backend application is not returning the results synchronously. That is, some requests take much longer to process than others. Hence,
Client 1 : send http request C1 --> get response R1
Client 2 : send http request C2 --> get response R2
Client 3 : send http request C3 --> get response R3
The Python middleware receives them in some order: C2, C3, C1, and dispatches them in this order to the backend (as non-HTTP messages). The backend responds with results in a mixed order: R1, R3, R2. The Python middleware should package these responses back into HTTP response objects and send each response back to the relevant client.
Is there any sample code for programming this sort of behavior? There seem to be around 20 different web frameworks for Python, and I'm confused as to which one would be best for this scenario (I would prefer something as lightweight as possible... I would consider Django too heavy... I tried Bottle, but I am not sure how to go about programming it for this scenario).
================================================
Update (based on the discussions below): Requests have a request id. Responses have a response id (which should match the request id they correspond to). There is only one socket connection between the middleware and the remote backend application. While we can maintain a {request_id: ip_address} dictionary, the issue is how to construct an HTTP response object for the correct client. I assume threading might solve this problem, where each thread maintains its own response object.
Screw frameworks. This is exactly the kind of task for asyncore. This module allows event-based network programming: given a set of sockets, it calls back the given handlers when data is ready on any of them. That way, threads are not necessary just to dumbly wait for data on one socket to arrive and painfully pass it to another thread. You would have to implement the HTTP handling yourself, but examples can be found of that. Alternatively, you could use the async feature of uWSGI, which would allow your application to be integrated with an existing web server, although that does not integrate with asyncore by default, though it wouldn't be hard to make it work. It depends on your specific needs.
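To make the asyncore suggestion concrete, here is a minimal sketch of the backend side of such a link. dispatch_response is a hypothetical function that matches a response id to its waiting client; note also that asyncore was later deprecated and removed in Python 3.12, with asyncio as the modern replacement:

```python
import asyncore
import socket

class BackendLink(asyncore.dispatcher):
    """One non-blocking link to the backend. asyncore calls handle_read
    whenever data is ready, so no thread has to block on recv()."""

    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((host, port))
        self.outbox = b''

    def handle_read(self):
        data = self.recv(4096)
        dispatch_response(data)  # hypothetical: match response id to client

    def writable(self):
        return bool(self.outbox)

    def handle_write(self):
        sent = self.send(self.outbox)
        self.outbox = self.outbox[sent:]

# asyncore.loop() then multiplexes this socket alongside the listening
# HTTP socket in a single thread.
```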
Quoting your comment:
The middleware uses a single persistent socket connection to the backend. All requests from the middleware are forwarded via this single socket. Clients do send a request id along with their requests. The response id should match the request id. So the question remains: how does the middleware (web server) keep track of which request id belonged to which client? I mean, is there any way for a cgi script in the middleware to create a db of tuples like (request id, client ip, client tcp port), and once a response id matches, send an HTTP response to clientip:clienttcpport?
Is there any special reason for doing all this processing in a middleware? You should be able to do all this in a decorator, or somewhere else, if more appropriate.
Anyway, you need to maintain a global concurrent dictionary (extend dict and protect it using threading.Lock). Upon a new request, store the given request id as the key and associate it with the respective client (sender). Whenever your backend responds, retrieve the client from this dictionary and remove the entry so it doesn't accumulate forever.
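A minimal sketch of that dictionary; the client value is whatever handle your server needs to finish the response (a socket, a response object, and so on):

```python
import threading

class PendingRequests(dict):
    """Thread-safe request-id -> client mapping: a dict subclass
    guarded by a lock, as suggested above."""

    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()

    def add(self, request_id, client):
        with self._lock:
            self[request_id] = client

    def pop_client(self, response_id):
        # Remove the entry so the table does not grow forever.
        with self._lock:
            return self.pop(response_id, None)

pending = PendingRequests()
# On request:  pending.add(req_id, client_handle)
# On response: client = pending.pop_client(resp_id)
```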
UPDATE: someone already extended the dictionary for you - check this answer.
Ultimately you're going from the synchronous HTTP request-response protocol of your clients to an asynchronous queuing/messaging protocol with your backend. So you have two choices: (1) make requests wait until the backend has no outstanding work, then process one at a time, or (2) write something that marries the backend responses with their associated requests (using a dictionary of requests or something similar).
One way might be to run your server in one thread while dealing with your backend in another (see Run Python HTTPServer in Background and Continue Script Execution), or maybe look at aiohttp (https://docs.aiohttp.org/en/v0.12.0/web.html).

Request-Aware Code in Google App Engine -- os.environ?

In GAE, you can say users.get_current_user() to get the currently logged-in user implicit to the current request. This works even if multiple requests are being processed simultaneously -- the users module is somehow aware of which request the get_current_user function is being called on behalf of. I took a look into the code of the module in the development server, and it seems to be using os.environ to get the user email and other values associated to the current request.
Does this mean that every request gets an independent os.environ object?
I need to implement a service similar to users.get_current_user() that would return different values depending on the request being handled by the calling code. Assuming os.environ is the way to go, how do I know which variable names are already being used (or reserved) by GAE?
Also, is there a way to add a hook (or event handler) that gets called before every request?
As the docs say,
A Python web app interacts with the App Engine web server using the CGI protocol.
This basically means that exactly one request is being served at any one time within a given process (although, unlike real CGI, one process can be serially reused for multiple requests, one after the other, if it defines main functions in the various modules to which app.yaml dispatches). See also this page, and this one for documentation of the environment variables CGI defines and uses.
The hooks App Engine defines are around calls at the RPC layer, not the HTTP requests. To intercept each request before it gets served, you could use app.yaml to redirect all requests to a single .py file and perform your interception in that file's main function before redirecting (or, you could call your hook at the start of the main in every module you're using app.yaml to dispatch to).
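As a sketch of what request-aware code looks like under that CGI model: since a process serves exactly one request at a time, os.environ is effectively request-local. USER_EMAIL is the variable the development server sets for the signed-in user; the MYAPP_ prefix is a made-up convention for staying clear of the names CGI reserves:

```python
import os

def get_current_user_email():
    """Request-aware accessor: under CGI semantics the environment is
    rebuilt for every request, so this always reflects the request
    currently being served by this process."""
    return os.environ.get('USER_EMAIL') or None

def set_request_value(key, value):
    # Prefix custom variables so they cannot collide with the names CGI
    # itself defines (REQUEST_METHOD, QUERY_STRING, PATH_INFO, ...).
    os.environ['MYAPP_' + key] = str(value)

def get_request_value(key, default=None):
    return os.environ.get('MYAPP_' + key, default)
```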
