I've stumbled upon pexpect and my impression is that it looks roughly similar to fabric. I've tried to find some comparison, without success, so I'm asking here, in case someone has experience with both tools.
Is my impression (that they are roughly equivalent) correct, or is that just how it looks on the surface?
I've used both. Fabric is more high level than pexpect, and IMHO a lot better. It depends what you're using it for, but if your use is deployment and configuration of software then Fabric is the right way to go.
You can also combine them, to have the best of both worlds: Fabric's remoting capabilities and pexpect's handling of prompts. Have a look at these answers: https://stackoverflow.com/a/10007635/708221 and https://stackoverflow.com/a/9614913/708221
There are different use cases for both. Something that pexpect does that Fabric doesn't is preserving state. Each Fabric API command (e.g. run/sudo) is its own individual command. So if you do:
run("cd project_dir && workon project")
run("make")
The second run() won't be in that directory, nor will it be in the virtualenv. While there are context managers for cd() in Fabric now, they more or less prepend each run with a cd, as in the sketch below.
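For example, a minimal sketch using Fabric 1.x's cd() and prefix() context managers (directory and virtualenv names taken from the snippet above):

# Sketch: Fabric 1.x context managers; each run() is still its own SSH command,
# but Fabric prepends the cd/prefix for you on every call inside the block.
from fabric.api import run, cd, prefix

def build():
    with cd("project_dir"), prefix("workon project"):
        run("make")  # effectively: cd project_dir && workon project && make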
In the scheme of things this has little bearing on how the majority of projects work, and is essentially unnoticed. For some needs, however, you might use pexpect to manage this state: multiple sudos, or some sort of interactive task that can't be automated with flags.
None of this is a demerit for Fabric, though: since it's plain Python, you're more than able to include pexpect code inside Fabric tasks.
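For instance, a small sketch of pexpect driving an interactive prompt from inside a Fabric task ("some-interactive-tool" and its prompt text are hypothetical placeholders):

# Sketch: driving an interactive prompt with pexpect from inside a Fabric task.
# "some-interactive-tool" and its prompt are made-up placeholders.
import pexpect
from fabric.api import task

@task
def provision():
    child = pexpect.spawn("some-interactive-tool --init")
    child.expect(r"Continue\? \[y/n\]")   # wait for the tool's prompt
    child.sendline("y")                    # answer it programmatically
    child.expect(pexpect.EOF)              # let the tool finish
    print(child.before.decode())           # whatever it printed before exiting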
In all other ways, though, Fabric essentially manages all the hard work of remote connections and running commands better than you would by writing code from the ground up with pexpect.
Update: I've been informed of a project that works with Fabric and pexpect; you can see more in this question's answer.
I'm looking to set up a distributed system where there are compute/worker machines running resource-heavy Python 3 code, and there is a single web server that serves the results of the Python computation to clients. I would very much like to write the web server in Node.js.
I've looked into using an RPC framework. Specifically, this question led me to ZeroRPC, but it's not compatible with Python 3 (the main issue is that it requires gevent, which isn't that close to a Python 3 version yet). There doesn't seem to be another viable option for Python–Node.js RPC as far as I can tell.
In light of that, I'm open to using something other than RPC, especially since I've read that the RPC strategy hides too much from the programmer.
I'm also open to using a different language for the web server if that really makes more sense; for example, it may be much simpler from a development point of view to just use Python for the server too.
How can I achieve this?
You have a few options here.
First, it sounds like you like ZeroRPC, and your only problem is that it depends on gevent, which is not 3.x-ready yet.
Well, gevent is close to 3.x-ready. There are a few forks of it that people are testing and even using, which you can see on issue #38. As of mid-September 2014 the one that seems to be getting the most traction is Michal Mazurek's. If you're lucky, you can just do this:
pip3 install git+https://github.com/MichalMazurek/gevent
pip3 install ZeroRPC
Or, if ZeroRPC has metadata that says it's Python 2-only, you can install it from its repo the same way as gevent.
The downside is that none of the gevent-3.x forks are quite battle-tested yet, which is why none of them have been accepted upstream and released. But if you're not in a huge hurry, and willing to take a risk, there's a pretty good chance you can start with a fork today, and switch to the final version when it's released, hopefully before you've reached 1.0 yourself.
Second, ZeroRPC is certainly not the only RPC library available for either Python or Node. And most of them have a similar kind of interface for exposing methods over RPC. And, while you may ultimately need something like ZeroMQ for scalability or deployment reasons, you can probably use something simpler and more widespread like JSON-RPC over HTTP—which has a dozen or more Python and Node implementations—for early development, then switch to ZeroRPC later.
Third, RPC isn't exactly complicated, and binding methods to RPCs the way most libraries do isn't that hard. Making it asynchronous can be tricky, but again, for early development you can just use an easy but nonscalable solution—creating a thread for each request—and switch to something else later. (Of course that solution is only easy if your service is stateless; otherwise you're just eliminating all of your async problems and replacing them with race condition problems…)
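To make that concrete, here's a minimal, hand-rolled sketch using only the Python standard library: JSON-RPC-style dispatch over HTTP with a thread per request (the add method, port, and message handling are made up, and errors and most of the JSON-RPC spec are ignored):

# Sketch: hand-rolled JSON-RPC-style dispatch over HTTP, one thread per request.
# Good enough for early development; not the scalable production setup.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn

def add(a, b):                       # example method exposed over RPC
    return a + b

METHODS = {"add": add}

class RPCHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        request = json.loads(self.rfile.read(length))
        result = METHODS[request["method"]](*request.get("params", []))
        body = json.dumps({"jsonrpc": "2.0",
                           "id": request.get("id"),
                           "result": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    pass                             # one thread per request, as described above

if __name__ == "__main__":
    ThreadedHTTPServer(("127.0.0.1", 8000), RPCHandler).serve_forever()

A Node.js client would then just POST something like {"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1} to that endpoint.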
ZeroMQ offers several transport classes; the first two below are best suited to a heterogeneous RPC layer:
ipc://
tcp://
pgm://
epgm://
inproc://
ZeroMQ has ports for both Python and Node.js, so it will definitely serve your projected needs as well.
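As a rough sketch of what the Python end of such a layer might look like with pyzmq, a REP socket bound over tcp:// (the port number and JSON message shape are arbitrary examples; a Node.js REQ socket would connect to the same endpoint):

# Sketch: the Python side of a tcp:// REQ/REP pair using pyzmq.
# The endpoint and message format are illustrative only.
import json
import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)          # REP: reply socket, pairs with a REQ client
socket.bind("tcp://*:5555")               # arbitrary example endpoint

while True:
    request = json.loads(socket.recv())   # e.g. {"method": "compute", "params": [...]}
    result = {"result": sum(request.get("params", []))}   # placeholder "computation"
    socket.send_string(json.dumps(result))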
I currently have my own set of wrappers around Paramiko, a few functions that log the output as a command gets executed, reboot some server, transfer files, etc. However, I feel like I'm reinventing the wheel and this should already exist somewhere.
I've looked into Fabric and it partially provides this, but its execution model would force me to rewrite a big part of my code, especially because it shares information about the hosts in global variables and doesn't seem to have been originally intended for use as a library.
Preferably, each server would be represented by an object, so I could keep state about it and run commands with something like server.run("uname -a"). It should provide some basic tools like rebooting, checking for connectivity and transferring files, and ideally even give me some simple way to run a command on a subset of servers in parallel. A rough sketch of the kind of interface I have in mind is below.
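To make it concrete, a minimal sketch of the kind of wrapper I mean, built directly on Paramiko (the class and method names are just illustrative, not an existing library):

# Sketch of the kind of per-server object I mean, built directly on Paramiko.
# Names (Server, run, reboot) are illustrative, not an existing library.
import paramiko

class Server:
    def __init__(self, host, username):
        self.host = host
        self.client = paramiko.SSHClient()
        self.client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self.client.connect(host, username=username)

    def run(self, command):
        stdin, stdout, stderr = self.client.exec_command(command)
        return stdout.read().decode(), stderr.read().decode()

    def reboot(self):
        return self.run("sudo reboot")

server = Server("web01.example.com", username="deploy")
print(server.run("uname -a")[0])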
Is there already some library that provides this?
Look at Ansible: 'minimal ssh command and control'. From their description: 'Ansible is a radically simple configuration-management, deployment, task-execution, and multinode orchestration framework'.
Fabric 2.0 (currently in development) will probably be similar to what you have in mind.
Have a look at https://github.com/fabric/fabric/tree/v2
I am attempting to build an educational coding site, similar to Codecademy, but I am frankly at a loss as to what steps should be taken. Could someone point me in the right direction for including even a simple Python interpreter in a web app?
One option might be to use PyPy to create a sandboxed Python. It would limit the external operations someone could do.
Once you have that set up, your website would take the source code, send it over AJAX to your web server, and the server would run the code in a subprocess of a sandboxed Python instance. You would also be able to kill the process if it took longer than, say, 5 seconds. Then you return the output as a response to the client, along the lines of the sketch below.
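For the execution side, a rough sketch using the standard library; note that "python3" here is only a stand-in for a sandboxed interpreter, and running untrusted code with a plain interpreter like this is not safe:

# Sketch: run submitted source in a subprocess with a hard 5-second timeout.
# "python3" stands in for a sandboxed interpreter (e.g. a PyPy sandbox wrapper);
# running untrusted code with a stock interpreter like this is NOT safe.
import subprocess

def run_user_code(source: str) -> str:
    try:
        proc = subprocess.run(
            ["python3", "-c", source],      # placeholder for the sandboxed interpreter
            capture_output=True,
            text=True,
            timeout=5,                      # kill it if it runs longer than 5 seconds
        )
        return proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        return "Error: code took longer than 5 seconds"

print(run_user_code("print('hello from the sandbox')"))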
See these links for help on a PyPy sandbox:
http://doc.pypy.org/en/latest/sandbox.html
http://readevalprint.com/blog/python-sandbox-with-pypy.html
Creating a fully interactive REPL would be even more involved. You would need to keep an interpreter alive for each client on your server, then accept AJAX "lines" of input, run them through the interpreter by communicating with the running process, and return the output, roughly as sketched below.
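A very rough, Unix-only sketch of that per-session plumbing (the session IDs, the 0.5-second read window and the use of python3 -i are all simplifications; a real server would need sturdier output handling and, of course, sandboxing):

# Sketch: one long-lived interpreter per session, fed one line at a time.
# Unix-only (uses select on the pipe); output collection is deliberately simplistic.
import os
import select
import subprocess

sessions = {}   # session_id -> Popen of a live interpreter

def get_session(session_id):
    if session_id not in sessions:
        sessions[session_id] = subprocess.Popen(
            ["python3", "-i", "-u"],               # -i keeps the REPL alive, -u unbuffers
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
    return sessions[session_id]

def run_line(session_id, line):
    proc = get_session(session_id)
    proc.stdin.write((line + "\n").encode())
    proc.stdin.flush()
    chunks = []
    # read whatever output is ready within ~0.5s; includes banner/prompt noise
    while select.select([proc.stdout], [], [], 0.5)[0]:
        chunks.append(os.read(proc.stdout.fileno(), 4096))
    return b"".join(chunks).decode()

print(run_line("user-1", "x = 21 * 2"))
print(run_line("user-1", "print(x)"))       # state persists between calls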
Overall, not trivial. You would need some strong dev skills to do this comfortably. You may find this task a bit daunting if you are just learning.
There's more to do here than you think.
The major problem is that you cannot let people run arbitrary Python code on your webserver. For example, what happens if they do
import os
os.system("rm -rf *.*")
So clearly you have to run this Python code securely. But then you have the problem of securing Python, which is basically impossible because of how dynamic it is. And so you'll probably have to run the Python shell in a virtual machine, which comes with its own headaches.
Have you seen e.g. http://code.google.com/p/google-app-engine-samples/downloads/detail?name=shell_20091112.tar.gz&can=2&q=?
One recent option for this is to use repl.it.
This option is awesome because the compilers are written in JavaScript, so compilation and execution happen on the user's side, meaning that the server is free of those vulnerabilities.
They have compilers for: Python3, Python, JavaScript, Java, Ruby, PHP...
I strongly recommend checking out their site at http://repl.it
Look into LXC containers. They have a pretty cool API that you can use to create lightweight Linux containers. You could run the subprocess commands inside such a container, so the end user can't mess with your main server.
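As a minimal sketch, you could shell out to the LXC command-line tools from Python; this assumes a container named "sandbox" already exists and is running, and that you have the privileges lxc-attach requires:

# Sketch: run untrusted code inside an existing, running LXC container named "sandbox".
# Assumes the lxc-attach CLI is available and you have the privileges to use it.
import subprocess

def run_in_container(source: str, timeout: int = 5) -> str:
    proc = subprocess.run(
        ["lxc-attach", "-n", "sandbox", "--", "python3", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.stdout + proc.stderr

print(run_in_container("print('hello from inside the container')"))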
I'm looking for a tool to keep track of "what's running where". We have a bunch of servers, and on each of those a bunch of projects. These projects may be running on a specific version (hg tag/commit nr) and have their requirements at specific versions as well.
Fabric looks like a great start to do the actual deployments by automating the ssh part. However, once a deployment is done there is no overview of what was done.
Before reinventing the wheel I'd like to check here on SO as well (I did my best w/ Google but could be looking for the wrong keywords). Is there any such tool already?
(In practice I'm deploying Django projects, but I'm not sure that's relevant for the question; anything that keeps track of pip/virtualenv installs or server state in general should be fine)
many thanks,
Klaas
==========
EDIT FOR TEMP. SOLUTION
==========
For now, we've chosen to simply store this information in a simple key-value store (in our case: the filesystem) that we take great care to back up (in our case: using a DCVS). We keep track of this store with the same deployment tool that we use to do the actual deploys (in our case: Fabric).
Passwords are stored inside a TrueCrypt volume that's stored inside our key-value store.
==========
I will still gladly accept any answer when some kind of Open Source solution to this problem pops up somewhere. I might share (part of) our solution somewhere myself in the near future.
pip freeze gives you a listing of all installed packages. Bonus: if you redirect the output to a file, you can use it as part of your deployment process to install all those packages (pip install -r can install everything listed in the file).
I see you're already using virtualenv. Good. You can run pip freeze -E myvirtualenv > myproject.reqs to generate a dependency file that doubles as a status report of the Python environment.
Perhaps you want something like Opscode Chef.
In their own words:
Chef works by allowing you to write recipes that describe how you want a part of your server (such as Apache, MySQL, or Hadoop) to be configured. These recipes describe a series of resources that should be in a particular state - for example, packages that should be installed, services that should be running, or files that should be written. We then make sure that each resource is properly configured, only taking corrective action when it's necessary. The result is a safe, flexible mechanism for making sure your servers are always running exactly how you want them to be.
EDIT: Note that Chef is not a Python tool; it is a general-purpose tool, written in Ruby (it seems). But it is capable of supporting various "cookbooks", including one for installing/maintaining Python apps.
Does anyone know of a working and well-documented implementation of a daemon using Python? Please post a link here if you know of a project that fits these two requirements.
Three options I can think of:
1. Make a cron job that calls your script. cron is the common name for a GNU/Linux daemon that periodically launches scripts according to a schedule you set. You add your script to a crontab, or place a symlink to it in a special directory, and the daemon handles the job of launching it in the background. You can read more on Wikipedia. There are a variety of different cron daemons, but your GNU/Linux system should already have one installed.
2. Take a Pythonic approach (a library, for example) that lets your script daemonize itself. Yes, it will require a simple event loop (where your events are timer ticks, possibly provided by the sleep function). Here is the one I recommend and use: A simple unix/linux daemon in Python. A rough sketch of this approach follows below the list.
3. Use the Python multiprocessing module. The nitty-gritty of forking a process and so on is hidden in this implementation. It's pretty neat.
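A rough sketch of option 2, using the third-party python-daemon library (PEP 3143) rather than the linked recipe; the log path and interval are placeholders:

# Sketch of option 2: daemonize, then run a simple timer-driven event loop.
# Uses the third-party python-daemon library; the path and interval are made up.
import time
import daemon

def main_loop():
    with open("/tmp/mydaemon.log", "a") as log:
        while True:
            log.write("still alive at %s\n" % time.ctime())
            log.flush()
            time.sleep(60)          # the "event" is just a timer tick

if __name__ == "__main__":
    # DaemonContext handles the double fork, setsid, closing file descriptors, etc.
    with daemon.DaemonContext():
        main_loop()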
I wouldn't recommend 2 or 3, because you're in fact duplicating cron's functionality. The Linux system paradigm is to let multiple simple tools interact to solve your problems. Unless there are additional reasons why you need a daemon (beyond triggering periodically), choose the first approach.
Also, if you daemonize with a loop and a crash happens, make sure you have logs that will help you debug, and devise a way for the script to start again. If the script is added as a cron job, on the other hand, it will simply trigger again at the next scheduled interval.
If you just want to run a daemon, consider Supervisor, a daemon that itself controls and manages daemons.
If you want to look at the nitty-gritty, you can check out Supervisor's launch script or some of the responses to this lazyweb request.
Check this link for a double-fork daemon: http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
The code is readable and well-documented. You may want to take a look at chapter 13 of W. Richard Stevens's book 'Advanced Programming in the UNIX Environment' for detailed information on Unix daemons.
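The core of that recipe is the classic double fork; a stripped-down sketch (pidfile handling and stdio redirection omitted) looks roughly like this:

# Minimal sketch of the classic UNIX double-fork daemonization (Unix-only).
# The real recipe also redirects stdin/stdout/stderr and writes a pidfile.
import os
import sys
import time

def daemonize():
    if os.fork() > 0:          # first fork: parent exits, child keeps running
        sys.exit(0)
    os.setsid()                # become session leader, detach from the terminal
    if os.fork() > 0:          # second fork: session leader exits,
        sys.exit(0)            # so the daemon can never reacquire a controlling tty
    os.chdir("/")
    os.umask(0)

if __name__ == "__main__":
    daemonize()
    with open("/tmp/double_fork_daemon.log", "a") as log:
        while True:
            log.write("tick %s\n" % time.ctime())
            log.flush()
            time.sleep(60)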