Is distributing python source code in Docker secure?

Is distributing python source code in Docker secure? - python

I am about to decide on programming language for the project.
The requirements are that some of customers want to run application on isolated servers without external internet access.
To do that I need to distribute application to them and cannot use SaaS approach running on, for example, my cloud (what I'd prefer to do...).
The problem is that if I decide to use Python for developing this, I would need to provide customer with easy readable code which is not really what I'd like to do (of course, I know about all that "do you really need to protect your source code" kind of questions but it's out of scope for now).
One of my colleagues told me about Docker. I can find dozen of answers about Docker container security. Problem is all that is about protecting (isolating) host from code running in container.
What I need is to know if the Python source code in the Docker Image and running in Docker Container is secured from access - can user in some way (doesn't need to be easy) access that Python code?
I know I can't protect everything, I know it is possible to decompile/crack everything. I just want to know the answer just to decide whether the way to access my code inside Docker is hard enough that I can take the risk.

Docker images are an open and documented "application packaging" format. There are countless ways to inspect the image contents, including all of the python source code shipped inside of them.
Running applications inside of a container provides isolation from the application escaping the container to access the host. They do not protect you from users on the host inspecting what is occurring inside of the container.

Python programs are distributed as source code. If it can run on a client machine, then the code is readable on that machine. A docker container only contains the application and its libraries, external binaries and files, not a full OS. As the security can only be managed at OS level (or through encryption) and as the OS is under client control, the client can read any file on the docker container, including your Python source.
If you really want to go that way, you should consider providing a full Virtual Machine to your client. In that case, the VM contains a full OS with its account based security (administrative account passwords on the VM can be different from those of the host). Is is far from still waters, because it means that the client will be enable to setup or adapt networking on the VM among other problems...
And you should be aware the the client security officer could emit a strong NO when it comes to running a non controlled VM on their network. I would never accept it.
Anyway, as the client has full access to the VM, really securing it will be hard if ever possible (disable booting from an additional device may even not be possible). It is admitted in security that if the attacker has physical access, you have lost.
TL/DR: It in not the expected answer but just don't. It you sell your solution you will have a legal contract with your customer, and that kind of problem should be handled at a legal level, not a technical one. You can try, and I have even given you a hint, but IMHO the risks are higher than the gain.

I know that´s been more than 3 years, but... looking for the same kind of solution I think that including compiled python code -not your source code- inside the container would be a challenging trial for someone trying to access your valuable source code.
If you run pyinstaller --onefile yourscript.py you will get a compiled single file that can be run as an executable. I have only tested it in Raspberry, but as far as I know it´s the same for, say, Windows.
Of course anything can be reverse engineered, but hopefully it won´t be worth the effort to the regular end user.

I think it could be a solution as using a "container" to protect our code from the person we wouldn't let them access. the problem is docker is not a secure container. As the root of the host machine has the most powerful control of the Docker container, we don't have any method to protect the root from accessing inside of the container.
I just have some ideas about a secure container:
Build a container with init file like docker file, a password must be set when the container is created;
once the container is built, we have to use a password to access inside, including
reading\copy\modify files
all the files stored on the host machine should be encypt。
no "retrieve password" or “--skip-grant-” mode is offered. that means nobody can
access the data inside the container if u lost the password.
If we have a trustable container where we can run tomcat or Django server, code obfuscation will not be necessary.

Related

Can a website's controlling Python code be viewed?

I am trying to place a simple Flask app within a Docker container to be hosted on Firebase as per David East's article on https://medium.com/firebase-developers/hosting-flask-servers-on-firebase-from-scratch-c97cfb204579
Within the app, I have used Flask email to send emails automatically. Is it safe to leave the password as a string in the Python code?

It's extremely unsafe. The password shouldn't be in the code at all. Rotate the password immediately if you're concerned it might be compromised.
There are two important details about Docker that matter here. The first is that it's very easy to get content out of an image, especially if it's in an interpreted language like Python; an interested party can almost certainly docker run --rm -it --entrypoint sh your-image to get an interactive shell to poke around, and it's impossible to prevent this. The other is that it's basically trivial to use Docker to root the host – docker run --rm -it -v /:/host busybox sh can read and write any host file as root, including the internal Docker storage – and so there is a fairly high level of trust involved.
Including passwords in code at all is usually a mistake, and it's something most security scans will flag. If it's included in your code then it's probably checked into source control unencrypted, which also is a security issue. It being embedded in the code also probably makes it harder to change since the system operator won't have access to the code.
In a Docker context, often the best way to pass a credential is through a docker run -e environment variable; your Python code would see it in the os.environ dictionary. Passing it via a file that is not checked in to source control is arguably more secure, but also more complex, and I don't think the security gain is significant.

python copying directory and reading text files Remotely

I'm about to start working on a project where a Python script is able to remote into a Windows Server and read a bunch of text files in a certain directory. I was planning on using a module called WMI as that is the only way I have been able to successfully remotely access a windows server using Python, But upon further research I'm not sure i am going to be using this module.
The only problem is that, these text files are constantly updating about every 2 seconds and I'm afraid that the script will crash if it comes into an MutEx error where it tries to open the file while it is being rewritten. The only thing I can think of is creating a new directory, copying all the files (via script) into this directory in the state that they are in and reading them from there; and just constantly overwriting these ones with the new ones once it finishes checking all of the old ones. Unfortunately I don't know how to execute this correctly, or efficiently.
How can I go about doing this? Which python module would be best for this execution?

There is Windows support in Ansible these days. It uses winrm. There are plenty of Python libraries that utilize winrm, just google it, but Ansible is very versatile.
http://docs.ansible.com/intro_windows.html
https://msdn.microsoft.com/en-us/library/aa384426%28v=vs.85%29.aspx

I've done some work with WMI before (though not from Python) and I would not try to use it for a project like this. As you said WMI tends to be obscure and my experience says such things are hard to support long-term.
I would either work at the Windows API level, or possibly design a service that performs the desired actions access this service as needed. Of course, you will need to install this service on each machine you need to control. Both approaches have merit. The WinAPI approach pretty much guarantees you don't invent any new security holes and is simpler initially. The service approach should make the application faster and required less network traffic. I am sure you can think of others easily.
You still have to have the necessary permissions, network ports, etc. regardless of the approach. E.g., WMI is usually blocked by firewalls and you still run as some NT process.
Sorry, not really an answer as such -- meant as a long comment.
ADDED
Re: API programming, though you have no Windows API experience, I expect you find it familiar for tasks such as you describe, i.e., reading and writing files, scanning directories are nothing unique to Windows. You only need to learn about the parts of the API that interest you.
Once you create the appropriate security contexts and start your client process, there is nothing service-oriented in the, i.e., your can simply open and close files, etc., ignoring that fact that the files are remote, other than server name being included in the UNC name of the file/folder location.

Encrypted and secure docker containers

We all know situations when you cannot go open source and freely distribute software - and I am in one of these situations.
I have an app that consists of a number of binaries (compiled from C sources) and Python code that wraps it all into a system. This app used to work as a cloud solution so users had access to app functions via network but no chance to touch the actual server where binaries and code are stored.
Now we want to deliver the "local" version of our system. The app will be running on PCs that our users will physically own. We know that everything could be broken, but at least want to protect the app from possible copying and reverse-engineering as much as possible.
I know that Docker is a wonderful deployment tool so I wonder: is it possible to create encrypted Docker containers where no one can see any data stored in the container's filesystem? Is there a known solution to this problem?
Also, maybe there are well known solutions not based on Docker?

The root user on the host machine (where the docker daemon runs) has full access to all the processes running on the host. That means the person who controls the host machine can always get access to the RAM of the application as well as the file system. That makes it impossible to hide a key for decrypting the file system or protecting RAM from debugging.
Using obfuscation on a standard Linux box, you can make it harder to read the file system and RAM, but you can't make it impossible or the container cannot run.
If you can control the hardware running the operating system, then you might want to look at the Trusted Platform Module which starts system verification as soon as the system boots. You could then theoretically do things before the root user has access to the system to hide keys and strongly encrypt file systems. Even then, given physical access to the machine, a determined attacker can always get the decrypted data.

What you are asking about is called obfuscation. It has nothing to do with Docker and is a very language-specific problem; for data you can always do whatever mangling you want, but while you can hope to discourage the attacker it will never be secure. Even state-of-the-art encryption schemes can't help since the program (which you provide) has to contain the key.
C is usually hard enough to reverse engineer, for Python you can try pyobfuscate and similar.
For data, I found this question (keywords: encrypting files game).

If you want a completely secure solution, you're searching for the 'holy grail' of confidentiality: homomorphous encryption. In short, you want to encrypt your application and data, send them to a PC, and have this PC run them without its owner, OS, or anyone else being able to scoop at the data.
Doing so without a massive performance penalty is an active research project. There has been at least one project having managed this, but it still has limitations:
It's windows-only
The CPU has access to the key (ie, you have to trust Intel)
It's optimised for cloud scenarios. If you want to install this to multiple PCs, you need to provide the key in a secure way (ie just go there and type it yourself) to one of the PCs you're going to install your application, and this PC should be able to securely propagate the key to the other PCs.
Andy's suggestion on using the TPM has similar implications to points 2 and 3.

Sounds like Docker is not the right tool, because it was never intended to be used as a full-blown sandbox (at least based on what I've been reading). Why aren't you using a more full-blown VirtualBox approach? At least then you're able to lock up the virtual machine behind logins (as much as a physical installation on someone else's computer can be locked up) and run it isolated, encrypted filesystems and the whole nine yards.
You can either go lightweight and open, or fat and closed. I don't know that there's a "lightweight and closed" option.

I have exactly the same problem. Currently what I was able to discover is bellow.
A. Asylo(https://asylo.dev)
Asylo requires programs/algorithms to be written in C++.
Asylo library is integrated in docker and it seems to be feаsable to create custom dоcker image based on Asylo .
Asylo depends on many not so popular technologies like "proto buffers" and "bazel" etc. To me it seems that learning curve will be steep i.e. the person who is creating docker images/(programs) will need a lot of time to understand how to do it.
Asylo is free of charge
Asylo is bright new with all the advantages and disadvantages of being that.
Asylo is produced by Google but it is NOT an officially supported Google product according to the disclaimer on its page.
Asylo promises that data in trusted environment could be saved even from user with root privileges. However, there is lack of documentation and currently it is not clear how this could be implemented.
B. Scone(https://sconedocs.github.io)
It is binded to INTEL SGX technology but also there is Simulation mode(for development).
It is not free. It has just a small set of functionalities which are not paid.
Seems to support a lot of security functionalities.
Easy for use.
They seems to have more documentation and instructions how to build your own docker image with their technology.

For the Python part, you might consider using Pyinstaller, with appropriate options, it can pack your whole python app in a single executable file, which will not require python installation to be run by end users. It effectively runs a python interpreter on the packaged code, but it has a cipher option, which allows you to encrypt the bytecode.
Yes, the key will be somewhere around the executable, and a very savvy costumer might have the means to extract it, thus unraveling a not so readable code. It's up to you to know if your code contains some big secret you need to hide at all costs. I would probably not do it if I wanted to charge big money for any bug solving in the deployed product. I could use it if client has good compliance standards and is not a potential competitor, nor is expected to pay for more licenses.
While I've done this once, I honestly would avoid doing it again.
Regarding the C code, if you can compile it into executables and/or shared libraries can be included in the executable generated by Pyinstaller.

Arbitrary Code Execution with Docker

I'm thinking about building a web app that would involve users writing small segments of python and the server testing that code. However, this presents a ton of security concerns. Would docker be a good isolation tool for running this potentially malicious code? From what I've read, checking system calls with ptrace is a possibility, but I would prefer to use a preexisting tool.

Docker is indeed very suitable for this kind of usage. However, please note that docker is NOT yet ready for production usage.
I would recommend to create a new container and give non-root privileges to your users to this container. One container per user.
This way, you can prepare your docker image and prepare the environment and control precisely what your users are doing :)

What are some successful methods for deploying a Django application on the desktop?

I have a Django application that I would like to deploy to the desktop. I have read a little on this and see that one way is to use freeze. I have used this with varying success in the past for Python applications, but am not convinced it is the best approach for a Django application.
My questions are: what are some successful methods you have used for deploying Django applications? Is there a de facto standard method? Have you hit any dead ends? I need a cross platform solution.

I did this a couple years ago for a Django app running as a local daemon. It was launched by Twisted and wrapped by py2app for Mac and py2exe for Windows. There was both a browser as well as an Air front-end hitting it. It worked pretty well for the most part but I didn't get to deploy it out in the wild because the larger project got postponed. It's been a while and I'm a bit rusty on the details, but here are a few tips:
IIRC, the most problematic thing was Python loading C extensions. I had an Intel assembler module written with C "asm" commands that I needed to load to get low-level system data. That took a while to get working across both platforms. If you can, try to avoid C extensions.
You'll definitely need an installer. Most likely the app will end up running in the background, so you'll need to mark it as a Windows service, Unix daemon, or Mac launchd application.
In your installer you'll want to provide a way to locate a free local TCP port. You may have to write a little stub routine that the installer runs or use the installer's built-in scripting facility to find a port that hasn't been taken and save it to a config file. You then load the config file inside your settings.py and whatever front-end you're going to deploy. That's the shared port. Or you could just pick a random number and hope no other service on the desktop steps on your toes :-)
If your front-end and back-end are separate apps then you'll need to design an API for them to talk to each other. Make sure you provide a flag to return the data in both raw and human-readable form. It really helps in debugging.
If you want Django to be able to send notifications to the user, you'll want to integrate with something like Growl or get Python for Windows extensions so you can bring up toaster pop-up notifications.
You'll probably want to stick with SQLite for database in which case you'll want to make sure you use semaphores to tackle multiple requests vying for the database (or any other shared resource). If your app is accessed via a browser users can have multiple windows open and hit the app at the same time. If using a custom front-end (native, Air, etc...) then you can control how many instances are running at a given time so it won't be as much of an issue.
You'll also want some sort of access to local system logging facilities since the app will be running in the background and make sure you trap all your exceptions and route it into the syslog. A big hassle was debugging Windows service startup issues. It would have been impossible without system logging.
Be careful about hardcoded paths if you want to stay cross-platform. You may have to rely on the installer to write a config file entry with the actual installation path which you'll have to load up at startup.
Test actual deployment especially across a variety of firewalls. Some of the desktop firewalls get pretty aggressive about blocking access to network services that accept incoming requests.
That's all I can think of. Hope it helps.

If you want a good solution, you should give up on making it cross platform. Your code should all be portable, but your deployment - almost by definition - needs to be platform-specific.
I would recommend using py2exe on Windows, py2app on MacOS X, and building deb packages for Ubuntu with a .desktop file in the right place in the package for an entry to show up in the user's menu. Unfortunately for the last option there's no convenient 'py2deb' or 'py2xdg', but it's pretty easy to make the relevant text file by hand.
And of course, I'd recommend bundling in Twisted as your web server for making the application easily self-contained :).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.