We have a collection of Unix scripts (and/or Python modules) that each perform a long running task. I would like to provide a web interface for them that does the following:
Asks for relevant data to pass into scripts.
Allows for starting/stopping/killing them.
Allows for monitoring the progress and/or other information provided by the scripts.
Possibly some kind of logging (although the scripts already do logging).
I do know how to write a server that does this (e.g. by using Python's built-in HTTP server/JSON), but doing this properly is non-trivial and I do not want to reinvent the wheel.
Are there any existing solutions that allow for maintaining asynchronous server-side tasks?
Django is great for writing web applications, and the subprocess module (subprocess.Popen en .communicate()) is great for executing shell scripts. You can give it a stdin,stdout and stderr stream for communication if you want.
Answering my own question, I recently saw the announcement of Celery 1.0, which seems to do much of what I am looking for.
I would use SGE, but I think it could be overkill for your need...
Related
I have a running CLI application in Python that uses threads to execute some workers. Now I am writing a GUI using electron for this application. For simple requests/responses I am using gRPC to communicate between the Python application and the GUI.
I am, however, struggling to find a proper publishing mechanism to push data to the GUI: gRPCs integrated streaming won't work since it uses generators; as already mentioned my longer, blocking tasks are executed using threads (subclasses of threading.Thread). Also I'd like to emit certain events (e.g., the progress) from within those threads.
Then I've found the Flasks SocketIO implementation, which is, however, a blocking execution and thus not really suited for what I have in mind - I'd have to again execute two processes (Flask and my CLI application)...
Another package I've found is websockets but I can't get my head around how I could implement this producer() function that they mention in the patterns.
My last idea would be to deploy a broker-based message system like Redis or simply fall back to the brokerless zmq, which is a bit of a hassle to setup for the GUI application.
So the simple question:
Is there any easy framework that allows to create a server-"task" in a Python that I can pass messages to publish to?
For anyone struggling with concurrency in python:
No, there isn't any simple framework. IMHO pythons' concurrency handling is a bit of a mess (compared to other languages like golang, where concurrency is built in). There's multiple major packages implementing this, one of them asyncio, but most of them are incompatible. I've ended up using a similar solution like proposed in this question.
I'm not super familiar with asyncio, but I was hoping there would be some easy way to use asyncio.Queue to push log messages to a Queue instead of writing them to the disk, and then have a worker on a thread wait for these Queue events and write them to disk when resources are available. This seems pretty widely necessary, as logging is a huge bottleneck in a lot of code but isn't always needed in real time. Are there any pre-existing packages for this or can anyone with more experience write a short example script to get me started? NOTE: This needs to interface with existing code, so making it all packaged in a class would probably be preferred.
It's handled in the standard library in recent Python versions. See this post for information, and the official documentation. This functionality predates asyncio, and so doesn't use it (and doesn't especially need to).
For Python 2.7, you can use the logutils package which provides equivalent functionality.
I am attempting to build an educational coding site, similar to Codecademy, but I am frankly at a loss as to what steps should be taken. Could I be pointed in the right direction in including even a simple python interpreter in a webapp?
One option might be to use PyPy to create a sandboxed python. It would limit the external operations someone could do.
Once you have that set up, your website would take the code source, send it over ajax to your webserver, and the server would run the code in a subprocess of a sandboxed python instance. You would also be able to kill the process if it took longer than say 5 seconds. Then you return the output back as a response to the client.
See these links for help on a PyPy sandbox:
http://doc.pypy.org/en/latest/sandbox.html
http://readevalprint.com/blog/python-sandbox-with-pypy.html
To create a fully interactive REPL would be even more involved. You would need to keep an interpreter alive to each client on your server. Then accept ajax "lines" of input and run them through the interp by communicating with the running process, and return the output.
Overall, not trivial. You would need some strong dev skills to do this comfortably. You may find this task a bit daunting if you are just learning.
There's more to do here than you think.
The major problem is that you cannot let people run arbitrary Python code on your webserver. For example, what happens if they do
import os
os.system("rm -rf *.*")
So clearly you have to run this Python code securely. But then you have the problem of securing Python, which is basically impossible because of how dynamic it is. And so you'll probably have to run the Python shell in a virtual machine, which comes with its own headaches.
Have you seen e.g. http://code.google.com/p/google-app-engine-samples/downloads/detail?name=shell_20091112.tar.gz&can=2&q=?
One recent option for this is to use repl.
This option is awesome because the compilers are made using JavaScript so the compilation and execution is made in the user-side, meaning that the server is free of vulnerabilities.
They have compilers for: Python3, Python, Javascript, Java, Ruby, PHP...
I strongly recommend you to check their site at http://repl.it
Look into LXC Containers. They have a pretty cool api that you can use to create lightweight linux containers. You could run the subprocess commands inside that container that way the end user could not mess with your main server.
Does anyone know of a working and well documented implementation of a daemon using python? Please post a link here if you know of a project that fits these two requirements.
Three options I can think of-
Make a cron job that calls your script. Cron is a common name for a GNU/Linux daemon that periodically launches scripts according to a schedule you set. You add your script into a crontab or place a symlink to it into a special directory and the daemon handles the job of launching it in the background. You can read more at wikipedia. There is a variety of different cron daemons, but your GNU/Linux system should have it already installed.
Pythonic approach (a library, for example) for your script to be able to daemonize itself. Yes, it will require a simple event loop (where your events are timer triggering, possibly, provided by sleep function). Here is the one I recommend & use - A simple unix/linux daemon in Python
Use python multiprocessing module. The nitty-gritty of trying to fork a process etc. are hidden in this implementation. It's pretty neat.
I wouldn't recommend 2 or 3 'coz you're in fact repeating cron functionality. The Linux system paradigm is to let multiple simple tools interact and solve your problems. Unless there are additional reasons why you should make a daemon (in addition to trigger periodically), choose the other approach.
Also, if you use daemonize with a loop and a crash happens, make sure that you have logs which will help you debug. Also devise a way so that the script starts again. While if the script is added as a cron job, it will trigger again in the time gap you kept.
If you just want to run a daemon, consider Supervisor, a daemon that itself controls and manages daemons.
If you want to look at the nitty-gritty, you can check out Supervisor's launch script or some of the responses to this lazyweb request.
Check this link for a double-fork daemon: http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
The code is readable and well-documented. You want to take a look at chapter 13 of W. Richard's book 'Advanced Programming in the UNix Environment' for detailed information on Unix daemons.
To give a little background, I'm writing (or am going to write) a daemon in Python for scheduling tasks to run at user-specified dates. The scheduler daemon also needs to have a JSON-based HTTP web service interface (buzzword mania, I know) for adding tasks to the queue and monitoring the scheduler's status. The interface needs to receive requests while the daemon is running, so they either need to run in a separate thread or cooperatively multitask somehow. Ideally the web service interface should run in the same process as the daemon, too.
I could think of a few ways to do it, but I'm wondering if there's some obvious module out there that's specifically tailored for this kind of thing. Any suggestions about what to use, or about the project in general are quite welcome. Thanks! :)
Check out the class BaseHTTPServer -- a "Basic HTTP server" bundled with Python.
http://docs.python.org/library/basehttpserver.html
You can spin up a second thread and have it serve your requests for you very easily (probably < 30 lines of code). And it all runs in the same process and Python interpreter space, so it can access all your objects, etc.
I'm not sure I understand your question properly, but take a look at Twisted
I believed all kinds of python web framework is useful.
You can pick up one like CherryPy, which is small enough to integrate into your system. Also CherryPy includes a pure python WSGI server for production.
Also the performance may not be as good as apache, but it's already very stable.
Don't re-invent the bicycle!
Run jobs via cron script, and create a separate web interface using, for example, Django or Tornado.
Connect them via a database. Even sqlite will do the job if you don't want to scale on more machines.