Restrict Python system calls to virtualenv or directory - python

Is there any simple way to restrict Python system calls (os.system, subprocess, ...) to a given folder/tree?
A possible use case would be a shared webserver, where the users/students can upload their (e.g. Bottle) apps to run via wsgi/uwsgi and nginx or so.
In order to simplify the configuration, all webapps run under the same system user (e.g. www-data) and store their data under /var/www/webapp_name.
But what if some "smart" user includes some function in his app which tries to make a system call to read or modify something in another location of the system?
A possible solution could be to create separate system users for each webapp and tighten the permissions, but they could still do plenty of potential damage, and it would mean extra configuration overhead compared to just web users with no system privileges.
If virtualenv could somehow allow something like
os.system('ls ./')
but block something like
os.system('ls /')
or
os.system('rm -rf ../another_webapp')
it could be really useful.
This could probably be done with something like SELinux or AppArmor too, but it would be cleaner to have a pure pythonic solution.
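A pure-Python guard is easy to bypass (whatever you patch, the uploaded code can patch back), but for illustration, here is a minimal sketch of confining a child process with a chroot before exec. The helper name and jail path are made up, and chroot(2) itself requires root, so a privileged launcher would have to do this rather than the webapp:

import os
import subprocess

def run_confined(cmd, jail="/var/www/webapp_name"):
    """Run cmd with its filesystem root changed to `jail` (hypothetical helper)."""
    def _jail():
        os.chroot(jail)  # needs root; restricts the child's view of the filesystem
        os.chdir("/")    # start inside the jail
    return subprocess.call(cmd, shell=True, preexec_fn=_jail)

# Inside the jail, 'ls /' can only list the contents of /var/www/webapp_name
run_confined("ls /")

In practice it is the kernel-level mechanism (separate users, AppArmor/SELinux profiles, containers) that enforces the boundary; the sketch only shows where a chroot would slot in.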

Related

How to properly locally 'deploy' a Python based server application for development?

A pretty large Python based project I'm working on has to deal with a situation some of you might know:
you have a local checkout which the server cannot be run from (for historical reasons); you alter a couple of files, e.g. by editing or via git operations, and then you want to locally 'patch' a running server residing at a different location in the file system.
[Local Checkout, e.g. /home/me/project] = deploy => [Running Environment, e.g. /opt/project]
The 'deployment' process might have to run arbitrary build scripts, copy modified files, maybe restart a running service and so on.
Note that I'm not talking about CI or web-deployment - it's more like you change something on your source files and want to know if it runs (locally).
Currently we do this with a home-grown hierarchy of scripts and want to improve this approach, e.g. with a make-based one.
Personally I dislike make for Python projects for a couple of reasons, but in principle the thing I'm looking for could be done with make, i.e. it detects modifications, knows dependencies and it can do arbitrary stuff to meet the dependencies.
I'm now wondering if there isn't something like make for Python projects with the same basic features as make but with 'Python awareness' (Python bindings, nice handling of command-line args, etc.).
Does this kind of 'deploy my site for development' process have a name I should know? I'm not asking what program I should use but how I should inform myself (examples are very welcome, though).
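One Python-aware alternative to make for this kind of local 'deploy' step is Invoke (a task runner with a Python API and command-line argument handling). The following is only a sketch, assuming an rsync copy plus a service restart is enough; the paths, build script and service name are placeholders:

# tasks.py -- run with `invoke deploy`
from invoke import task

CHECKOUT = "/home/me/project"  # local checkout (placeholder)
TARGET = "/opt/project"        # running environment (placeholder)

@task
def build(c):
    """Run the project's build scripts (placeholder command)."""
    with c.cd(CHECKOUT):
        c.run("./build.sh")

@task(pre=[build])
def deploy(c):
    """Copy modified files into the running environment and restart the service."""
    c.run(f"rsync -a --delete {CHECKOUT}/ {TARGET}/")
    c.run("sudo systemctl restart project.service", warn=True)

Unlike make, Invoke doesn't track file modification times by itself; if dependency tracking is the main point, doit is the usual Python-aware answer on that side.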

How to "redirect" filesystem read/write calls without root and performance degradation?

I have non-root access to a server that is shared by many users. I first develop and run some code locally, and then I want to rsync my data to a temporary location on the remote server and run my code there without changing any file paths.
I want to transparently hijack filesystem reads and writes and redirect them to different folders, like, if I run
redirect /home/a /home/b/remote-home/a python code.py
and then the code tries to read from /home/a/a.txt, it should get the content of /home/b/remote-home/a/a.txt, and the same with writes.
I am particularly interested in doing this for a Python process, if that is necessary. I use a lot of third-party libraries that do file IO, so just mocking builtins.open is not an option. That IO is pretty intensive (reading and writing gigabytes of data), so performance degradation that exceeds something like 200-300% is an issue.
Options that I am aware of are:
redefining read, read64, write, etc. calls with an LD_PRELOAD shim that would call the real functions with different paths under the hood
same with ptrace
unshare and remount parts of the filesystem, but user namespaces are disabled in my particular case for whatever security reasons
The first two options seem not very reliable (and ptrace must be slow), unless there is some fairly stable piece of code that does exactly that, so I could be sure that I did not make any obvious buffer overflow errors there. Containers like Docker are not an option because they are not installed on the remote server. Unless, of course, there are some userspace containers that do not rely on Linux namespaces under the hood.
Update: not a full answer, but Singularity manages to provide such functionality without giving everyone root privileges.
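Following up on that update: with Singularity the redirection can be expressed as a bind mount when the process is launched, so the code itself does not change. A rough sketch of wrapping that launch from Python, where the image name is made up and the paths are the ones from the example above:

import subprocess

# Bind the remote copy over the path the code expects: inside the container,
# /home/a is backed by /home/b/remote-home/a, so reads and writes are redirected.
subprocess.run([
    "singularity", "exec",
    "--bind", "/home/b/remote-home/a:/home/a",
    "image.sif",              # hypothetical container image
    "python", "code.py",
], check=True)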

Small library for remote command execution similar to Fabric

I currently have my own set of wrappers around Paramiko, a few functions that log the output as a command gets executed, reboot some server, transfer files, etc. However, I feel like I'm reinventing the wheel and this should already exist somewhere.
I've looked into Fabric and it partially provides this, but its execution model would force me to rewrite a big part of my code, especially because it shares information about the hosts in global variables and doesn't seem to have been originally intended to be used as a library.
Preferably, each server should be represented by an object, so I could save state about it and run commands using something like server.run("uname -a"). It should provide some basic tools like rebooting, checking for connectivity and transferring files, and ideally even give me some simple way to run a command on a subset of servers in parallel.
Is there already some library that provides this?
Look at Ansible: 'minimal ssh command and control'. From their description: 'Ansible is a radically simple configuration-management, deployment, task-execution, and multinode orchestration framework'.
Fabric 2.0 (currently in development) will probably be similar to what you have in mind.
Have a look at https://github.com/fabric/fabric/tree/v2
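For a feel of the per-server object model, this is roughly the direction the v2 branch takes (one Connection object per host). The hostnames and commands below are placeholders, and the details may still change while 2.0 is in development:

from fabric import Connection, ThreadingGroup

server = Connection("user@web1.example.com")        # one object per server, holds its state

result = server.run("uname -a", hide=True)          # run a command, capture output
print(result.stdout.strip())

server.put("app.tar.gz", remote="/tmp/app.tar.gz")  # transfer a file
server.sudo("reboot", warn=True)                    # privileged command; don't raise on failure

# Run the same command on a subset of servers in parallel
ThreadingGroup("web1.example.com", "web2.example.com").run("uptime")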

fabric vs pexpect

I've stumbled upon pexpect, and my impression is that it looks roughly similar to Fabric. I've tried to find some comparison, without success, so I'm asking here in case someone has experience with both tools.
Is my impression (that they are roughly equivalent) correct, or is that just how it looks on the surface?
I've used both. Fabric is more high level than pexpect, and IMHO a lot better. It depends what you're using it for, but if your use is deployment and configuration of software then Fabric is the right way to go.
You can also combine them to have the best of both worlds: Fabric's remoting capabilities and pexpect's handling of prompts. Have a look at these answers: https://stackoverflow.com/a/10007635/708221 and https://stackoverflow.com/a/9614913/708221
There are different use cases for both. Something that pexpect does that Fabric doesn't is preserving state. Each Fabric API command (e.g. run/sudo) is its own individual command. So if you do:
run("cd project_dir && workon project")
run("make")
This won't be in that directory, nor will it be in the virtualenv. While there are context managers for cd() in Fabric now, they more or less prepend each run with a cd.
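For reference, this is the kind of cd()/prefix() usage referred to above (Fabric 1-style API); it keeps both commands under the same directory and virtualenv by prepending them to each run:

from fabric.api import run, cd, prefix

# Roughly equivalent to: run("cd project_dir && workon project && make")
with cd("project_dir"), prefix("workon project"):
    run("make")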
In the scheme of things this has little bearing on how the majority of projects work and goes essentially unnoticed. For some needs, however, you might use pexpect to manage this state, for multiple sudos or some sort of interactive task that can't be automated with flags.
None of this is a demerit for Fabric, though: since it's pure Python, you're more than able to include pexpect code inside Fabric tasks.
In all other ways, though, Fabric essentially manages all the hard work of remote connections and running commands better than you would by writing code from the ground up with pexpect.
Update: I've been informed of a project that works with Fabric and pexpect; you can see more in this question's answer.

How to deploy highly iterative updates

I have a set of binary assets (swf files) each about 150Kb in size. I am developing them locally on my home computer and I want to periodically deploy them for review. My current strategy is:
Copy the .swf's into a transfer directory that is also a hg (mercurial) repo.
hg push the changes to my Slicehost VPS
ssh onto my Slicehost VPS
cd to my transfer directory and hg up
su www and cp the changed files into my public folder for viewing.
I would like to automate the process. Best case scenario is something close to:
Copy the .swf's into a "quick deploy" directory
Run a single local script to do all of the above.
I am interested in:
advice on where to put passwords since I need to su www to transfer files into the public web directories.
how the division of responsibility between local machine and server is handled.
I think rsync is a better tool than hg here, since I don't really need a revision history of these kinds of changes. I can write this as a Python script, a shell script, or however is considered best practice.
Eventually I would like to build this into a system that can handle my modest deployment needs. Perhaps there is an open-source deployment system that handles this and other types of situations? I'll probably roll my own for this current need, but long term I'd like something relatively flexible.
Note: My home development computer is OS X and the target server is some recent flavour of Ubuntu. I'd prefer a python based solution but if this is best handled from the shell I have no problems putting it together that way.
To avoid su www I see two easy choices.
Make a folder writable by you and readable by www's group in some path that the web server will be able to serve; then you can rsync to that folder from somewhere on your local machine.
Put your public ssh key in www's authorized_keys and rsync to the www user (a bit less secure in some setups perhaps, but not by much, and usually more convenient).
Working around su www by putting your password or its password in some file would seem far less secure.
A script to invoke "rsync -avz --partial /some/path www@server:some/other/path" should be quick to write in Python (although I don't know Python well).
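A hedged sketch of such a script, using the same rsync flags; the local path and the www@server destination are just the placeholders from this thread:

#!/usr/bin/env python
import subprocess
import sys

SRC = "/some/path/"                    # local "quick deploy" directory
DEST = "www@server:some/other/path/"   # relies on your key being in www's authorized_keys

def deploy():
    # -a preserves permissions/timestamps, -v is verbose, -z compresses,
    # --partial keeps partially transferred files so interrupted runs can resume
    return subprocess.call(["rsync", "-avz", "--partial", SRC, DEST])

if __name__ == "__main__":
    sys.exit(deploy())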
If you're at all comfortable in Python, I recommend Fabric for automated deployment scripts.
In addition to group permissions or ssh-ing as www (with key-based auth), a third solution to the permissions issue would be to add your user to /etc/sudoers and use sudo (you can specify the exact command your user is allowed to use sudo for, so you can make the security implications minimal).
