Tornado websocket logging - python

I'm trying to implement websockets using the Tornado web server.
My setup looks as follows:
from tornado.options import options, define, parse_command_line
import django.core.handlers.wsgi
import logging
import tornado.httpserver
import tornado.ioloop
import tornado.web
import tornado.wsgi
from pogows.tornado_sockets import GetSocketHandler, UpdateSocketHandler
from mobile.cleaner import start_cleaning
define('port', type=int, default=8080)
tornado.options.options['log_file_prefix'].set('/var/www/pogo_django/logs/tornado_server.log')
tornado.options.parse_command_line()
<snip>
def main():
    logger = logging.getLogger(__name__)
    wsgi_app = tornado.wsgi.WSGIContainer(
        django.core.handlers.wsgi.WSGIHandler())
    tornado_app = tornado.web.Application(
        [
            ('/hello-tornado', HelloHandler),
            ('/socket/get', GetSocketHandler),
            ('/socket/update', UpdateSocketHandler),
            ('.*', tornado.web.FallbackHandler, dict(fallback=wsgi_app)),
        ], debug=True)
    logger.info("Tornado POGO server starting...")
    server = tornado.httpserver.HTTPServer(tornado_app)
    server.listen(options.port)
    start_cleaning()
    tornado.ioloop.IOLoop.instance().start()
So far everything looks fine, tornado logs, I see the info message.
Now, I'm trying to log some stuff from websocket handler classes.
class GetSocketHandler(tornado.websocket.WebSocketHandler):
    def open(self):
        print "opening"
    def on_close(self):  # Tornado's callback is named on_close, not on_closed
        print "closing"
    def on_message(self, message):
        last_update = datetime.datetime.utcnow().replace(tzinfo=utc)
        try:
            print "getting_user"
            ...
Tornado is governed by supervisord, with the following configuration:
[program:pogo_tornado]
command=/var/www/pogo_django/tornado_server.py
user=www-data
stdout_logfile=/var/www/pogo_django/logs/pogo_stdout.log
stderr_logfile=/var/www/pogo_django/logs/pogo_stderr.log
environment=PYTHONPATH="/var/www/pogo_django/",DJANGO_SETTINGS_MODULE="pogo.settings"
I tried a few things.
Just use print statements, as you see from the above snippet, hoping for supervisord to catch it and send to stdout/stderr logs.
Create a separate logging.getLogger() instance inside the websocket class and use that.
Neither approach produces the desired result.
When I run Tornado from the command line by hand, I do see the print output on the console, but logging still doesn't work.
Where am I going wrong?

Bah, I got it. I was calling getLogger() without setting a logging level and just blindly logging at DEBUG.
Explicitly calling logger.setLevel(logging.DEBUG) made my messages show up in the logs.
Apparently Tornado sets a different level by default. Stupid me.
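The effect can be reproduced with the standard logging module alone. This is only a minimal sketch, not the original handler code: the logger name "pogows.demo" and the ListHandler class are made up for illustration, and the root logger is set to INFO here on the assumption that this mirrors what Tornado's parse_command_line() does with its default --logging=info option.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical in-memory handler so the demo needs no log file."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def demo():
    # Assumption: mimic Tornado's default root level (--logging=info).
    logging.getLogger().setLevel(logging.INFO)

    logger = logging.getLogger("pogows.demo")  # hypothetical logger name
    logger.setLevel(logging.NOTSET)            # no level of its own: inherit from root
    sink = ListHandler()
    logger.addHandler(sink)
    try:
        # Dropped: the effective level inherited from root is INFO.
        logger.debug("dropped: effective level is INFO")
        # The fix from the answer: give the logger its own DEBUG level.
        logger.setLevel(logging.DEBUG)
        logger.debug("kept: logger now accepts DEBUG")
    finally:
        logger.removeHandler(sink)
    return sink.messages
```

Calling demo() returns only the second message, which matches the symptom in the question: DEBUG records vanish silently until the logger (or the root logger) is told to accept them.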

Related

Putting AsyncHTTPClient or another awaitable in Tornado's get method creates a ThreadPoolExecutor automatically

How can I prevent a Tornado server from creating a ThreadPoolExecutor automatically?
env:
windows 10
python 3.7
Tornado 6.0.2
import tornado.ioloop
import tornado.web
from tornado.httpclient import HTTPRequest, AsyncHTTPClient
class TestHandler(tornado.web.RequestHandler):
    WRITE_MP3_BUFFER_SIZE = 4096
    async def get(self):
        try:
            http_client = AsyncHTTPClient()
            req = HTTPRequest(
                url='https://www.google.com',
                method='GET')
            response = await http_client.fetch(req)
            contents = response.body.decode('utf-8')
            self.write(contents)
        except Exception as e:
            self.write(str(e))
if __name__ == "__main__":
    app = tornado.web.Application([
        tornado.web.url(r"/", TestHandler),
    ])
    app.listen(5000)
    print("Service Started")
    tornado.ioloop.IOLoop.current().start()
I debug this code in VS Code and query http://127.0.0.1:5000 from Chrome. When I set breakpoints in VS Code while debugging, I see a ThreadPoolExecutor thread appear in the call stack on every request. Will these keep increasing without limit, or will they shut down?
This ThreadPoolExecutor is used for DNS requests, and comes from the standard library's asyncio module. It has a limited size, so it will stop growing at some point (the limit depends on your version of python). You can control this with asyncio's set_default_executor method, but I wouldn't worry about it.
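As a rough illustration of the set_default_executor suggestion, the sketch below installs a small, named thread pool as an asyncio loop's default executor and checks that work submitted with executor=None lands on it. The pool size of 2 and the "dns-pool" prefix are arbitrary choices for the example. Whether a given Tornado version routes its DNS lookups through this default executor is an assumption worth checking against that version's documentation.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


async def worker_thread_name(loop):
    # executor=None means "use the loop's default executor".
    return await loop.run_in_executor(
        None, lambda: threading.current_thread().name
    )


def run_demo(max_workers=2):
    loop = asyncio.new_event_loop()
    # Bounded pool; the name prefix makes its threads easy to spot in a debugger.
    executor = ThreadPoolExecutor(
        max_workers=max_workers, thread_name_prefix="dns-pool"
    )
    loop.set_default_executor(executor)
    try:
        return loop.run_until_complete(worker_thread_name(loop))
    finally:
        executor.shutdown(wait=True)
        loop.close()
```

With a named pool, the threads showing up in the debugger's call stack are easy to attribute, and max_workers puts an explicit cap on how many can exist.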

Use zerorpc inside Flask app throws error "operation would block forever"

I have a RPC Server using zerorpc in Python, written this way
import zerorpc
from service import Service
print('RPC server - loading')
def main():
    print('RPC server - main')
    s = zerorpc.Server(Service())
    s.bind("tcp://*:4242")
    s.run()
if __name__ == "__main__":
    main()
It works fine when I create a client
import zerorpc, sys
client_rpc = zerorpc.Client()
client_rpc.connect("tcp://127.0.0.1:4242")
name = sys.argv[1] if len(sys.argv) > 1 else "dude"
print(client_rpc.videos('138cd9e5-3c4c-488a-9b6f-49907b55a040.webm'))
and run it. The print() call outputs what the 'videos' function returns.
But when I try to use this same code inside a route in a Flask app, I receive the following error:
File "src/gevent/__greenlet_primitives.pxd", line 35, in
gevent.__greenlet_primitives._greenlet_switch
gevent.exceptions.LoopExit: This operation would block forever Hub:
The flask method/excerpt
import zerorpc, sys
client_rpc = zerorpc.Client()
client_rpc.connect("tcp://127.0.0.1:4242")
@app.route('/videos', methods=['POST'])
def videos():
    global client_rpc
    client_rpc.videos('138cd9e5-3c4c-488a-9b6f-49907b55a040.webm')
I can't find out what might be happening. I'm quite new to Python and I understand that this may have some relation with Flask and how it handles the thread, but I can't figure out how to solve it.
zerorpc depends on gevent, which provides async IO with cooperative coroutines. This means your flask application must use gevent for all IO operations.
In your specific case, you are likely starting your application with a standard blocking IO WSGI server.
Here is a snippet using the WSGI server from gevent:
import zerorpc
from flask import Flask
from gevent.pywsgi import WSGIServer
app = Flask(__name__)
client_rpc = zerorpc.Client()
client_rpc.connect("tcp://127.0.0.1:4242")
@app.route('/videos', methods=['POST'])
def videos():
    global client_rpc
    client_rpc.videos('138cd9e5-3c4c-488a-9b6f-49907b55a040.webm')
    # ...
if __name__ == "__main__":
    # Serve the Flask app with gevent's cooperative WSGI server
    http = WSGIServer(('', 5000), app)
    http.serve_forever()
Excerpt from https://sdiehl.github.io/gevent-tutorial/#chat-server

Logging failure with multiprocessing

I am trying to implement logging with multiprocessing for our Flask application. We use Python 2.7. I am using queues to collect log requests from all the forks and then write the queued records to the log. I followed this approach. The only change from that link is that I am using TimedRotatingFileHandler instead of RotatingFileHandler. This is my dictConfig
I am initializing the logger before initializing the forks, in the following way:
import logging
import logging.config
import os
import yaml
from flask import Flask
from tornado.wsgi import WSGIContainer
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
path = 'share/log_test/logging.yaml'
if os.path.exists(path):
    with open(path, 'rt') as f:
        config = yaml.load(f.read())
    logging.config.dictConfig(config)
logger = logging.getLogger('debuglog')  # problem starts if I keep this statement
app = Flask(__name__)
init_routes(app)  # initialize our routes
server_conf = config_manager.load_config(key='server')
# Only this line gets logged; log statements issued by the forks
# through the same logger are not written to the file.
logger.info("Logging is set up.")
http_server = HTTPServer(WSGIContainer(app))
http_server.bind(server_conf.get("PORT"))  # port to listen on
http_server.start(server_conf.get("FORKS"))  # number of forks
IOLoop.current().start()
The problem I am facing is that if I call getLogger in the code before initializing the forks, the forks do not write logs to the log file; only log statements issued before the forks are initialized get logged. If I remove the logging.getLogger('debuglog') call, the forks log correctly.
I paused the execution flow and verified whether the handler is assigned to the logger, and that seems to be fine.
Why is this strange behavior observed?
Update: when I use another logger that writes to the same file, everything works fine. But when I use the same logger, it does not work. Is this anything related to RLock?
I finally found a workaround for this problem. I removed the queues from the implementation and now emit each log record immediately, as soon as it is received.
def emit(self, record):
    try:
        s = self._format_record(record)
        self._handler.emit(record)  # emitting here itself
        # self.send(s)  # stopped sending it to queue
    except (KeyboardInterrupt, SystemExit):
        raise
    except:
        self.handleError(record)
which seems to be working fine under the following test:
8 workers - 200 requests - 50 concurrency
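For comparison, the queue-based design described in the question can be sketched with the standard library's QueueHandler and QueueListener. This is not the asker's Python 2.7 code: it assumes Python 3 and a POSIX system where the "fork" start method is available, and ListHandler is a hypothetical stand-in for the TimedRotatingFileHandler so the example needs no log file.

```python
import logging
import logging.handlers
import multiprocessing


class ListHandler(logging.Handler):
    """Hypothetical in-memory sink standing in for a file handler."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def worker(log_queue, worker_id):
    # Each forked worker only pushes records onto the shared queue.
    logger = logging.getLogger("debuglog")
    logger.handlers = [logging.handlers.QueueHandler(log_queue)]
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.info("hello from worker %d", worker_id)


def run_demo(n_workers=3):
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX only
    log_queue = ctx.Queue()
    workers = [
        ctx.Process(target=worker, args=(log_queue, i))
        for i in range(n_workers)
    ]
    # Fork first, then start the listener thread in the parent.
    for proc in workers:
        proc.start()
    sink = ListHandler()
    listener = logging.handlers.QueueListener(log_queue, sink)
    listener.start()
    for proc in workers:
        proc.join()
    listener.stop()  # drains the queue before the listener thread exits
    return sorted(sink.messages)
```

Only the parent process touches the real handler here, so the forks never share a file handle or a handler lock. The listener is started after the workers are forked so that no extra thread exists in the parent at fork time.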

concurrent connections in tornado

I have a server running on Tornado. I have a page that opens a websocket to the same server. I have observed that when I open multiple instances of this page, all of them wait except one. Only after that one has finished with its websocket does another one start. Is this normal Tornado behaviour, or am I doing something wrong?
Earlier my server was running on Django, but I migrated to Tornado for the websocket support. For that I use Django as the fallback server.
#!/usr/bin/env python
# Run this with
# PYTHONPATH=. DJANGO_SETTINGS_MODULE=testsite.settings testsite/tornado_main.py
# Serves by default at
# http://localhost:8080/hello-tornado and
# http://localhost:8080/hello-django
from tornado.options import options, define, parse_command_line
import django.core.handlers.wsgi
import tornado.httpserver
import tornado.ioloop
import tornado.web
import tornado.wsgi
define('port', type=int, default=8000)
class HelloHandler(tornado.web.RequestHandler):
    def get(self):
        self.write('Hello from tornado')
def main():
    wsgi_app = tornado.wsgi.WSGIContainer(
        django.core.handlers.wsgi.WSGIHandler())
    tornado_app = tornado.web.Application(
        [
            ('/hello-tornado', HelloHandler),
            ('.*', tornado.web.FallbackHandler, dict(fallback=wsgi_app)),
        ])
    server = tornado.httpserver.HTTPServer(tornado_app)
    server.listen(options.port)
    tornado.ioloop.IOLoop.instance().start()
if __name__ == '__main__':
    main()
Can I do something that can allow me to make multiple connections?
You need to look into the async facilities in Tornado to get this to work properly. Tornado normally runs as a single-threaded event loop, so a handler that blocks keeps every other connection waiting until it returns.
You can use the normal @asynchronous decorator or use their gen library to allow your code to handle multiple connections.
Decorator: http://www.tornadoweb.org/documentation/web.html#decorators
Gen: http://www.tornadoweb.org/documentation/gen.html
Read the documentation carefully if you choose to use the @asynchronous decorator, as you need to close the connection yourself when you are done with it.
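The difference between a blocking and a cooperative handler on a single-threaded loop can be sketched as follows. This uses asyncio from the standard library as a stand-in for Tornado's IOLoop. The handler names, the three simulated clients and the 0.1 second delay are assumptions made up for the illustration.

```python
import asyncio
import time


async def blocking_handler(delay):
    # time.sleep() never yields, so the whole loop stalls for `delay`.
    time.sleep(delay)


async def cooperative_handler(delay):
    # asyncio.sleep() hands control back to the loop while waiting.
    await asyncio.sleep(delay)


async def serve_all(handler, n_clients, delay):
    # Simulate n_clients connections arriving at the same time.
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(n_clients)))
    return time.perf_counter() - start


def run_demo(n_clients=3, delay=0.1):
    blocking = asyncio.run(serve_all(blocking_handler, n_clients, delay))
    cooperative = asyncio.run(serve_all(cooperative_handler, n_clients, delay))
    return blocking, cooperative
```

run_demo() returns two elapsed times: the blocking version takes roughly n_clients times the delay because the clients are served one after another, while the cooperative version takes roughly one delay in total.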
Yes, this is normal Tornado behaviour when you try to run heavy blocking applications like Django inside it.
You should definitely run Django and Tornado in separate OS processes, especially if you use the Django ORM.
Need I describe why?

Python. Tornado. Non-blocking xmlrpc client

Basically, we can call XML-RPC handlers in the following way:
import xmlrpclib
s = xmlrpclib.ServerProxy('http://remote_host/rpc/')
print s.system.listmethods()
In tornado we can integrate it like this:
import xmlrpclib
import tornado.web
s = xmlrpclib.ServerProxy('http://remote_host/rpc/')
class MyHandler(tornado.web.RequestHandler):
    def get(self):
        result = s.system.listmethods()
I have the following, somewhat newbie, questions:
1. Will result = s.system.listmethods() block Tornado?
2. Are there any non-blocking XML-RPC clients around?
3. How can we achieve result = yield gen.Task(s.system.listmethods)?
1. Yes, it will block Tornado, since xmlrpclib uses blocking Python sockets as it is.
2. Not that I'm aware of, but I'll provide a solution where you can keep xmlrpclib and still have it async.
3. My solution doesn't use Tornado's gen.
OK, so one useful library to keep in mind whenever you're doing networking and need to write async code is gevent. It's a really good, high-quality library that I would recommend to everyone.
Why is it good and easy to use?
You can write asynchronous code in a synchronous manner (which is what makes it easy).
All you have to do is monkey patch with one simple line:
from gevent import monkey; monkey.patch_all()
When using Tornado you need to know two things (that you may already know):
Tornado only supports asynchronous views when acting as an HTTPServer (WSGI isn't supported for async views).
Async views need to terminate their responses themselves, which you do by calling either self.finish() or self.render() (which calls self.finish()).
OK, so here's an example illustrating the gevent integration you would need with Tornado:
# gevent imports; monkey patch before anything else opens sockets
from gevent import monkey; monkey.patch_all()
import gevent
# Python imports
import functools
# Tornado imports
import tornado.ioloop
import tornado.web
import tornado.httpserver
# XMLRpc imports
import xmlrpclib
# Asynchronous gevent decorator
def gasync(func):
    @tornado.web.asynchronous
    @functools.wraps(func)
    def f(self, *args, **kwargs):
        return gevent.spawn(func, self, *args, **kwargs)
    return f
# Our XML RPC service
xml_service = xmlrpclib.ServerProxy('http://remote_host/rpc/')
class MyHandler(tornado.web.RequestHandler):
    @gasync
    def get(self):
        # This doesn't block tornado thanks to gevent
        # Which patches all of xmlrpclib's socket calls
        # So they no longer are blocking
        result = xml_service.system.listmethods()
        # Do something here
        # Write response to client
        self.write('hello')
        self.finish()
# Our URL Mappings
handlers = [
    (r"/", MyHandler),
]
def main():
    # Setup app and HTTP server
    application = tornado.web.Application(handlers)
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8000)
    # Start ioloop
    tornado.ioloop.IOLoop.instance().start()
if __name__ == "__main__":
    main()
So give the example a try (adapt it to your needs obviously) and you should be good to go.
No need to write any extra code, gevent does all the work of patching up python sockets so they can be used asynchronously while still writing code in a synchronous fashion (which is a real bonus).
Hope this helps :)
I do not think so, because Tornado has its own ioloop, whereas gevent's event loop is based on libevent.
So gevent will block Tornado's ioloop.
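A different way to keep the event loop free, which avoids mixing two event loops, is to push the blocking XML-RPC call onto a worker thread. The sketch below is not the gevent approach from the answer above and not gen.Task. It assumes Python 3 (xmlrpc.client instead of xmlrpclib) and uses plain asyncio with run_in_executor. The throwaway local server exists only so the example is self-contained, and all function names are made up for illustration.

```python
import asyncio
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def start_demo_server():
    """Start a throwaway local XML-RPC server on a free port (demo only)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_introspection_functions()  # exposes system.listMethods
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


async def list_methods(url):
    """Run the blocking XML-RPC call in the loop's default thread pool."""
    loop = asyncio.get_running_loop()

    def blocking_call():
        # A fresh proxy per call avoids sharing one connection across threads.
        with xmlrpc.client.ServerProxy(url) as proxy:
            return proxy.system.listMethods()

    # The event loop stays responsive while the worker thread waits on I/O.
    return await loop.run_in_executor(None, blocking_call)


def run_demo():
    server = start_demo_server()
    host, port = server.server_address
    try:
        return asyncio.run(list_methods("http://%s:%d/" % (host, port)))
    finally:
        server.shutdown()
        server.server_close()
```

The trade-off is one thread per in-flight call rather than a truly non-blocking client, but the call site keeps its synchronous shape and the loop is never stalled.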
