I have a Python Tornado app. The app contains request handlers, to which I am passing data like this (the code below is not complete, and is just to illustrate what I want):
configs = {'some_data': 1,  # etc.
           }

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [('/pageone', PageOneHandler, configs),
                    ('/pagetwo', PageTwoHandler, configs)]
        settings = dict(template_path='/templates',
                        static_path='/static', debug=False)
        tornado.web.Application.__init__(self, handlers, **settings)

# Run the instance
# ... code goes here ...
application = Application()
http_server = tornado.httpserver.HTTPServer(application)
# ... other code (bind to port, etc.)

# Callback function to update configs
some_time_period = 1000  # Once a second
tornado.ioloop.PeriodicCallback(update_configs, some_time_period).start()
tornado.ioloop.IOLoop.instance().start()
I want the update_configs function to update the configs variable defined above and have this change propagate through the handlers. For example (I know this doesn't work):
def update_configs():
    configs['some_data'] += 1

# Now, assuming PageOneHandler just prints out configs['some_data'], I'd expect
# the output to be: "1" on the first load, "2" if I load the page a second
# later, "4" if I load the page two seconds after that, etc.
The problem is, the configs variable is passed along to the handlers during creation in the constructor for the Application class. How can I update configs['some_data'] in the periodic callback function?
My actual use case for this mechanism is to refresh the data stored in the configs dictionary from the database every so often.
Is there an easy way to do this without fiddling around with application.handlers (which I have tried for the past hour or so)?
Well, the simplest thing would be to pass the entire config dict to the handlers, rather than just the individual values inside the dict. Because dicts are mutable, any change you make to the values in the dict would then propagate to all the handlers:
import tornado.web
import tornado.httpserver
import tornado.ioloop

configs = {'some_data': 1,  # etc.
           }

def update_configs():
    print("updating")
    configs['some_data'] += 1

class PageOneHandler(tornado.web.RequestHandler):
    def initialize(self, configs):
        self.configs = configs

    def get(self):
        self.write(str(self.configs) + "\n")

class PageTwoHandler(tornado.web.RequestHandler):
    def initialize(self, configs):
        self.configs = configs

    def get(self):
        self.write(str(self.configs) + "\n")

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [('/pageone', PageOneHandler, {'configs': configs}),
                    ('/pagetwo', PageTwoHandler, {'configs': configs})]
        settings = dict(template_path='/templates',
                        static_path='/static', debug=False)
        tornado.web.Application.__init__(self, handlers, **settings)

# Run the instance
application = Application()
http_server = tornado.httpserver.HTTPServer(application)
http_server.listen(8888)

# Callback function to update configs
some_time_period = 1000  # Once a second
tornado.ioloop.PeriodicCallback(update_configs, some_time_period).start()
tornado.ioloop.IOLoop.instance().start()
Output:
dan@dantop:~> curl localhost:8888/pageone
{'some_data': 2}
dan@dantop:~> curl localhost:8888/pageone
{'some_data': 3}
dan@dantop:~> curl localhost:8888/pagetwo
{'some_data': 4}
dan@dantop:~> curl localhost:8888/pageone
{'some_data': 4}
To me this approach makes the most sense; the data contained in configs doesn't really belong to any one instance of a RequestHandler, it's global state shared by all RequestHandlers, as well as by your PeriodicCallback. So I don't think it makes sense to try to create X copies of that state and then keep all those different copies in sync manually. Instead, just share the state across your whole process using either a custom object with class variables, or a dict, as shown above.
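The sharing behavior dano relies on can be seen without Tornado at all. The `Holder` class below is a purely illustrative stand-in for a handler:

```python
# Minimal sketch (no Tornado): both holders keep a reference to the SAME
# dict object, so a mutation made anywhere is visible everywhere.
configs = {'some_data': 1}

class Holder:
    def __init__(self, configs):
        self.configs = configs  # stores a reference, not a copy

h1 = Holder(configs)
h2 = Holder(configs)

configs['some_data'] += 1            # the "periodic callback" updates it...
assert h1.configs['some_data'] == 2  # ...and every holder sees the change
assert h2.configs is h1.configs      # truly one shared object
```

This is exactly why passing the whole dict (rather than the values inside it) makes the periodic updates visible to every handler.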
Another strategy, in addition to what dano mentions above, is to attach the shared data to the Application object.
class MyApplication(tornado.web.Application):
    def __init__(self):
        self.shared_attribute = foo
        handlers = []  # your handlers here
        settings = dict()  # your application settings here
        super().__init__(handlers, **settings)

server = tornado.httpserver.HTTPServer(MyApplication())
server.listen(8888)
tornado.ioloop.IOLoop.instance().start()
Next you can access shared_attribute defined above in all your request handlers using self.application.shared_attribute.
You update it at one place and it immediately reflects in all your subsequent calls to the request handlers.
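The mechanism can be sketched without running a server. `FakeApplication` and `FakeHandler` below are stand-ins, not Tornado classes: every handler holds a reference to the one application object, so an update through it is seen by all of them:

```python
class FakeApplication:
    # stand-in for tornado.web.Application carrying a shared attribute
    def __init__(self):
        self.shared_attribute = {"counter": 0}

class FakeHandler:
    def __init__(self, application):
        # Tornado sets self.application on real RequestHandlers for you
        self.application = application

    def read(self):
        return self.application.shared_attribute["counter"]

app = FakeApplication()
h1, h2 = FakeHandler(app), FakeHandler(app)
app.shared_attribute["counter"] = 5   # update in one place...
assert h1.read() == h2.read() == 5    # ...reflected in every handler
```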
Related
I'm switching from Bottle to Tornado. In Bottle I can easily define paths that have multiple variable parts, like this:
@app.get('/api/applications/<resource>/running_actions/<action_id>')
def get_application_running_action(resource, action_id):
    # Return running action (<action_id>) of the application (<resource>)
    pass
On Tornado I would like to have something like this:
app = tornado.web.Application([
    (r"/api", ApiRootHandler),
    (r"/api/applications/(.*)", ApiApplicationHandler),
    (r"/api/applications/(.*)/running_actions/(.*)", ApiRunningActionsHandler),
])
Then ApiRunningActionsHandler would search the application and the running actions for the application. But in ApiRunningActionsHandler's get() there is only one path parameter. Is there any way to do this in Tornado, or do I just need to parse the path again in ApiRunningActionsHandler? Which actually might not even be possible, because I want to direct requests to /api/applications/(.*) to another handler.
I figured it out. The main problem was that my regex was catching everything: the broad pattern
r"/api/applications/(.*)"
matched the whole remainder of the path as a single group, so the action_id argument was never set.
The second issue was that the most specific path must be defined first.
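The routing pitfall can be reproduced with plain `re` (the path here is a hypothetical example): a broad `(.*)` pattern tried first swallows the whole remainder of the path as one group:

```python
import re

# The broad route pattern, if tried first, matches the entire path...
broad = re.compile(r"/api/applications/(.*)")
m = broad.match("/api/applications/myapp/running_actions/42")
# ...and its single group swallows everything after the prefix:
assert m.group(1) == "myapp/running_actions/42"

# A tighter pattern, tried first, yields the two groups we actually want:
tight = re.compile(r"/api/applications/(\w+)/running_actions/([0-9]+)")
m2 = tight.match("/api/applications/myapp/running_actions/42")
assert m2.groups() == ("myapp", "42")
```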
This works:
class ApiRootHandler(tornado.web.RequestHandler):
    def get(self):
        pass

class ApiApplicationHandler(tornado.web.RequestHandler):
    def get(self, action_name):
        pass

class ApiRunningActionsHandler(tornado.web.RequestHandler):
    def get(self, action_name, action_id):
        self.write("action_name: " + action_name + ", action_id: " + action_id)

app = tornado.web.Application([
    (r"/api/applications/(\w+)/running_actions/([0-9]+)", ApiRunningActionsHandler),
    (r"/api/(\w+)", ApiApplicationHandler),
    (r"/api/", ApiRootHandler),
])
app.listen(8888)
tornado.ioloop.IOLoop.current().start()
Just make the second argument to ApiApplicationHandler.get optional:
class ApiApplicationHandler(RequestHandler):
    def get(self, resource, action_id=None):
        pass
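The idea behind the default value, sketched without Tornado (the class and return strings here are illustrative): one get() can serve both URL shapes because the extra capture group simply isn't passed for the shorter route:

```python
# Hypothetical stand-in handler, no Tornado involved: the default lets the
# same method handle a route with one capture group or with two.
class ApiApplicationHandler:
    def get(self, resource, action_id=None):
        if action_id is None:
            return "listing actions of " + resource
        return "action " + action_id + " of " + resource

h = ApiApplicationHandler()
assert h.get("myapp") == "listing actions of myapp"
assert h.get("myapp", "42") == "action 42 of myapp"
```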
My aim is to give a web framework access to a Pyro daemon that performs time-consuming tasks when it first loads. So far, I have managed to keep in memory (outside of the web app) a single instance of a class that takes care of the time-consuming loading at its initialization. I can also query it with my web app. The code for the daemon is:
@Pyro4.expose
@Pyro4.behavior(instance_mode='single')
class Store(object):
    def __init__(self):
        self._store = ...  # the expensive loading

    def query_store(self, query):
        return ...  # Useful query tool to expose to the web framework.
                    # Not time consuming, provided self._store is
                    # loaded.

with Pyro4.Daemon() as daemon:
    uri = daemon.register(Store)
    with Pyro4.locateNS() as ns:
        ns.register('store', uri)
    daemon.requestLoop()
The issue I am having is that although a single instance is created, it is only created at the first proxy query from the web app. This is normal behavior according to the doc, but not what I want, as the first query is still slow because of the initialization of Store.
How can I make sure the instance is already created as soon as the daemon is started?
I was thinking of creating a proxy instance of Store in the code of the daemon, but this is tricky because the event loop must be running.
EDIT
It turns out that daemon.register() can accept either a class or an object, which could be a solution. This is however not recommended in the doc (link above) and that feature apparently only exists for backwards compatibility.
Do whatever initialization you need outside of your Pyro code. Cache it somewhere. Use the instance_creator parameter of the @behavior decorator for maximum control over how and when an instance is created. You can even consider pre-creating server instances yourself and retrieving one from a pool if you so desire. Anyway, one possible way to do this is like so:
import Pyro4

def slow_initialization():
    print("initializing stuff...")
    import time
    time.sleep(4)
    print("stuff is initialized!")
    return {"initialized stuff": 42}

cached_initialized_stuff = slow_initialization()

def instance_creator(cls):
    print("(Pyro is asking for a server instance! Creating one!)")
    return cls(cached_initialized_stuff)

@Pyro4.behavior(instance_mode="percall", instance_creator=instance_creator)
class Server:
    def __init__(self, init_stuff):
        self.init_stuff = init_stuff

    @Pyro4.expose
    def work(self):
        print("server: init stuff is:", self.init_stuff)
        return self.init_stuff

Pyro4.Daemon.serveSimple({
    Server: "test.server"
})
But this complexity is not needed for your scenario: just initialize the thing that takes a long time and cache it somewhere. Instead of re-initializing it every time a new server object is created, just refer to the cached pre-initialized result. Something like this:
import Pyro4

def slow_initialization():
    print("initializing stuff...")
    import time
    time.sleep(4)
    print("stuff is initialized!")
    return {"initialized stuff": 42}

cached_initialized_stuff = slow_initialization()

@Pyro4.behavior(instance_mode="percall")
class Server:
    def __init__(self):
        self.init_stuff = cached_initialized_stuff

    @Pyro4.expose
    def work(self):
        print("server: init stuff is:", self.init_stuff)
        return self.init_stuff

Pyro4.Daemon.serveSimple({
    Server: "test.server"
})
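The caching principle itself is independent of Pyro. A Pyro-free sketch of why per-call server instances stay cheap once the expensive load is paid once at module level:

```python
import time

CALLS = {"count": 0}

def slow_initialization():
    CALLS["count"] += 1
    time.sleep(0.1)          # stands in for the expensive load
    return {"initialized stuff": 42}

cached_initialized_stuff = slow_initialization()  # paid exactly once, at startup

class Server:
    def __init__(self):
        # every new per-call instance just reuses the cached result
        self.init_stuff = cached_initialized_stuff

servers = [Server() for _ in range(100)]
assert CALLS["count"] == 1   # the slow part ran only once
assert servers[0].init_stuff is servers[99].init_stuff
```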
I have several classes in my program.
The main one, called WebServer, creates the web.py application itself and calls other classes for the web pages. Can I pass self.var1, for example, to the search class's __init__? I thought of just creating a method in the index class like set_var1 or something like that, but then I don't know how to access the specific instance of this class that the web application creates.
The class:
import sys
import os
import web
from pages.search import search
from pages.index import index

class WebServer:
    def __init__(self):
        self.var1 = "test"
        self.urls = (
            '/', 'index',
            '/search', 'search'
        )
        self.app = web.application(self.urls, globals())
        self.app.run()

if __name__ == "__main__":
    w = WebServer()
Not really, no. Specific instances of search and index are created by web.py in response to an incoming request. There are better / easier ways.
Also, putting this initialization in a WebServer class, while possible, isn't the common way of doing it with web.py. There's no need for the class to do this: it's a singleton and this file is essentially a startup / configuration file.
To have application-wide information available to your "response" classes (search, index, etc.), make that information either global, or hook it into web.config, which is a web.Storage. For example:
app = web.application(urls, globals())
web.config.update({"var1": "test"})
app.run()
This is then available to your response classes. For example:
class search(object):
    def GET(self):
        if web.config.var1 == 'test':
            return do_test_search()
        return do_regular_search()
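Why both web.config.update({...}) and web.config.var1 work: web.Storage is essentially a dict that also allows attribute access. A rough stand-in (not the real web.py class) behaves like this:

```python
class Storage(dict):
    """Rough stand-in for web.Storage: a dict allowing attribute access."""
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

config = Storage()
config.update({"var1": "test"})   # dict-style update...
assert config.var1 == "test"      # ...attribute-style read
```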
I have to test a server based on Jetty. This server can work with its own protocol, HTTP, HTTPS, and lately it has started to support SPDY. I have some stress tests based on httplib/http.client: each thread starts with a similar URL (some data in the query string is variable), adds the execution time to a global variable, and every few seconds shows some statistics. The code looks like:
t_start = time.time()
connection.request("GET", path)
resp = connection.getresponse()
t_stop = time.time()
check_response(resp)
QRY_TIMES.append(t_stop - t_start)
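As an aside, for interval measurements like this, time.perf_counter() (Python 3) is steadier than time.time(), which can jump on clock adjustments. A minimal stand-in for the measuring loop, with a fake request for illustration:

```python
import time

QRY_TIMES = []

def timed_request(do_request):
    # do_request stands in for connection.request()/getresponse()
    t_start = time.perf_counter()
    resp = do_request()
    QRY_TIMES.append(time.perf_counter() - t_start)
    return resp

resp = timed_request(lambda: "fake response")
assert resp == "fake response"
assert len(QRY_TIMES) == 1 and QRY_TIMES[0] >= 0
```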
The client for the native protocol shares the httplib API, so connection may be a native connection, an HTTPConnection, or an HTTPSConnection.
Now I want to add a SPDY test using the spdylay module. But its interface is opaque and I don't know how to turn its opaqueness into something similar to the httplib interface. I have made a test client based on the example, but since the 2nd argument to spdylay.urlfetch() is a class name and not an object, I do not know how to use it with my tests. I have already added tests to the on_close() method of my class, which extends spdylay.BaseSPDYStreamHandler, but that is not compatible with the other tests. If it were an instance, I would use it outside of the spdylay.urlfetch() call.
How can I use spydlay in a code that works based on httplib interfaces?
My only idea is to use a global dictionary where the URL is a key and the handler object is a value. It is not ideal because:
- new queries with the same URL will overwrite the previous response
- it is easy to forget to free the handler from the global dictionary
But it works!
import sys
import spdylay

CLIENT_RESULTS = {}

class MyStreamHandler(spdylay.BaseSPDYStreamHandler):
    def __init__(self, url, fetcher):
        super().__init__(url, fetcher)
        self.headers = []
        self.whole_data = []

    def on_header(self, nv):
        self.headers.append(nv)

    def on_data(self, data):
        self.whole_data.append(data)

    def get_response(self, charset='UTF8'):
        return (b''.join(self.whole_data)).decode(charset)

    def on_close(self, status_code):
        CLIENT_RESULTS[self.url] = self

def spdy_simply_get(url):
    spdylay.urlfetch(url, MyStreamHandler)
    data_handler = CLIENT_RESULTS[url]
    result = data_handler.get_response()
    del CLIENT_RESULTS[url]
    return result

if __name__ == '__main__':
    if '--test' in sys.argv:
        spdy_response = spdy_simply_get('https://localhost:8443/test_spdy/get_ver_xml.hdb')
I hope somebody can do spdy_simply_get(url) better.
I have the following two handlers for a web.py setup:
class count1:
    def GET(self):
        s.session.count += 1
        return str(s.session.count)

class count2:
    def GET(self):
        s.session.count += 1
        yield str(s.session.count)
The app runs on the CherryPy server shipped with web.py (app.run()) or on a gevent server.
urls = (
    "/count1", "count.count1",
    "/count2", "count.count2",
)
app = web.application(urls, locals())
session = web.session.Session(app, web.session.DiskStore('sessions'), initializer={'count': 0})
s.session = session

print "Main: setting count to 1"

from gevent.wsgi import WSGIServer

if __name__ == "__main__":
    usecherrypy = False
    if usecherrypy:
        app.run()
    else:  # gevent wsgiserver
        wsgifunc = app.wsgifunc()
        server = WSGIServer(('0.0.0.0', 8080), wsgifunc, log=None)
        server.serve_forever()
The session works fine in the count1 case but not always in count2. The first time a page of /count2 is loaded the counter is increased once, but refreshing after that doesn't increase the counter in the session, i.e. the update to the session is never saved. What could be wrong here?
web.py installed from PyPI or the latest from GitHub behaves the same in this case.
After digging into the code, the actual reason seems to be that when the handler uses yield, it is only called to return the generator object, and control then returns through all the enclosing processors (e.g. Session._processor, which calls _save in its finally block). web.py makes sure that the generator is completely unrolled before returning the data to the client, but the unrolling happens after all the processors have run, which is completely different behavior compared to normal function handlers.
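The core of the problem, that a generator's body doesn't run when the function is called, so a wrapping finally fires first, can be shown in isolation (the names here are illustrative, not web.py's):

```python
events = []

def handler():
    # body runs only when the generator is iterated, not when called
    events.append("handler body runs")
    yield "data"

def processor():
    # mimics web.py's Session._processor: save the session in a finally block
    try:
        return handler()   # returns a generator object immediately
    finally:
        events.append("session saved")

gen = processor()
list(gen)  # web.py unrolls the generator only later, for output

# the save happened BEFORE the handler body ever executed:
assert events == ["session saved", "handler body runs"]
```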
So the question is: are there any fixes, or workarounds (apart from calling Session._save manually), for this?
Thanks in advance for any answers!
Maybe it happens because yield returns a generator and not a value.
Refs:
http://od-eon.com/blogs/calvin/python-yield-versus-return/
What does the "yield" keyword do in Python?