Encrypted and secure docker containers - python

We all know situations when you cannot go open source and freely distribute software - and I am in one of these situations.
I have an app that consists of a number of binaries (compiled from C sources) and Python code that wraps it all into a system. This app used to work as a cloud solution so users had access to app functions via network but no chance to touch the actual server where binaries and code are stored.
Now we want to deliver the "local" version of our system. The app will be running on PCs that our users will physically own. We know that everything could be broken, but at least want to protect the app from possible copying and reverse-engineering as much as possible.
I know that Docker is a wonderful deployment tool so I wonder: is it possible to create encrypted Docker containers where no one can see any data stored in the container's filesystem? Is there a known solution to this problem?
Also, maybe there are well known solutions not based on Docker?

The root user on the host machine (where the docker daemon runs) has full access to all the processes running on the host. That means the person who controls the host machine can always get access to the RAM of the application as well as the file system. That makes it impossible to hide a key for decrypting the file system or protecting RAM from debugging.
Using obfuscation on a standard Linux box, you can make it harder to read the file system and RAM, but you can't make it impossible or the container cannot run.
If you can control the hardware running the operating system, then you might want to look at the Trusted Platform Module which starts system verification as soon as the system boots. You could then theoretically do things before the root user has access to the system to hide keys and strongly encrypt file systems. Even then, given physical access to the machine, a determined attacker can always get the decrypted data.

What you are asking about is called obfuscation. It has nothing to do with Docker and is a very language-specific problem; for data you can always do whatever mangling you want, but while you can hope to discourage the attacker it will never be secure. Even state-of-the-art encryption schemes can't help since the program (which you provide) has to contain the key.
C is usually hard enough to reverse engineer, for Python you can try pyobfuscate and similar.
For data, I found this question (keywords: encrypting files game).

If you want a completely secure solution, you're searching for the 'holy grail' of confidentiality: homomorphous encryption. In short, you want to encrypt your application and data, send them to a PC, and have this PC run them without its owner, OS, or anyone else being able to scoop at the data.
Doing so without a massive performance penalty is an active research project. There has been at least one project having managed this, but it still has limitations:
It's windows-only
The CPU has access to the key (ie, you have to trust Intel)
It's optimised for cloud scenarios. If you want to install this to multiple PCs, you need to provide the key in a secure way (ie just go there and type it yourself) to one of the PCs you're going to install your application, and this PC should be able to securely propagate the key to the other PCs.
Andy's suggestion on using the TPM has similar implications to points 2 and 3.

Sounds like Docker is not the right tool, because it was never intended to be used as a full-blown sandbox (at least based on what I've been reading). Why aren't you using a more full-blown VirtualBox approach? At least then you're able to lock up the virtual machine behind logins (as much as a physical installation on someone else's computer can be locked up) and run it isolated, encrypted filesystems and the whole nine yards.
You can either go lightweight and open, or fat and closed. I don't know that there's a "lightweight and closed" option.

I have exactly the same problem. Currently what I was able to discover is bellow.
A. Asylo(https://asylo.dev)
Asylo requires programs/algorithms to be written in C++.
Asylo library is integrated in docker and it seems to be feаsable to create custom dоcker image based on Asylo .
Asylo depends on many not so popular technologies like "proto buffers" and "bazel" etc. To me it seems that learning curve will be steep i.e. the person who is creating docker images/(programs) will need a lot of time to understand how to do it.
Asylo is free of charge
Asylo is bright new with all the advantages and disadvantages of being that.
Asylo is produced by Google but it is NOT an officially supported Google product according to the disclaimer on its page.
Asylo promises that data in trusted environment could be saved even from user with root privileges. However, there is lack of documentation and currently it is not clear how this could be implemented.
B. Scone(https://sconedocs.github.io)
It is binded to INTEL SGX technology but also there is Simulation mode(for development).
It is not free. It has just a small set of functionalities which are not paid.
Seems to support a lot of security functionalities.
Easy for use.
They seems to have more documentation and instructions how to build your own docker image with their technology.

For the Python part, you might consider using Pyinstaller, with appropriate options, it can pack your whole python app in a single executable file, which will not require python installation to be run by end users. It effectively runs a python interpreter on the packaged code, but it has a cipher option, which allows you to encrypt the bytecode.
Yes, the key will be somewhere around the executable, and a very savvy costumer might have the means to extract it, thus unraveling a not so readable code. It's up to you to know if your code contains some big secret you need to hide at all costs. I would probably not do it if I wanted to charge big money for any bug solving in the deployed product. I could use it if client has good compliance standards and is not a potential competitor, nor is expected to pay for more licenses.
While I've done this once, I honestly would avoid doing it again.
Regarding the C code, if you can compile it into executables and/or shared libraries can be included in the executable generated by Pyinstaller.

Related

Is distributing python source code in Docker secure?

I am about to decide on programming language for the project.
The requirements are that some of customers want to run application on isolated servers without external internet access.
To do that I need to distribute application to them and cannot use SaaS approach running on, for example, my cloud (what I'd prefer to do...).
The problem is that if I decide to use Python for developing this, I would need to provide customer with easy readable code which is not really what I'd like to do (of course, I know about all that "do you really need to protect your source code" kind of questions but it's out of scope for now).
One of my colleagues told me about Docker. I can find dozen of answers about Docker container security. Problem is all that is about protecting (isolating) host from code running in container.
What I need is to know if the Python source code in the Docker Image and running in Docker Container is secured from access - can user in some way (doesn't need to be easy) access that Python code?
I know I can't protect everything, I know it is possible to decompile/crack everything. I just want to know the answer just to decide whether the way to access my code inside Docker is hard enough that I can take the risk.
Docker images are an open and documented "application packaging" format. There are countless ways to inspect the image contents, including all of the python source code shipped inside of them.
Running applications inside of a container provides isolation from the application escaping the container to access the host. They do not protect you from users on the host inspecting what is occurring inside of the container.
Python programs are distributed as source code. If it can run on a client machine, then the code is readable on that machine. A docker container only contains the application and its libraries, external binaries and files, not a full OS. As the security can only be managed at OS level (or through encryption) and as the OS is under client control, the client can read any file on the docker container, including your Python source.
If you really want to go that way, you should consider providing a full Virtual Machine to your client. In that case, the VM contains a full OS with its account based security (administrative account passwords on the VM can be different from those of the host). Is is far from still waters, because it means that the client will be enable to setup or adapt networking on the VM among other problems...
And you should be aware the the client security officer could emit a strong NO when it comes to running a non controlled VM on their network. I would never accept it.
Anyway, as the client has full access to the VM, really securing it will be hard if ever possible (disable booting from an additional device may even not be possible). It is admitted in security that if the attacker has physical access, you have lost.
TL/DR: It in not the expected answer but just don't. It you sell your solution you will have a legal contract with your customer, and that kind of problem should be handled at a legal level, not a technical one. You can try, and I have even given you a hint, but IMHO the risks are higher than the gain.
I know that´s been more than 3 years, but... looking for the same kind of solution I think that including compiled python code -not your source code- inside the container would be a challenging trial for someone trying to access your valuable source code.
If you run pyinstaller --onefile yourscript.py you will get a compiled single file that can be run as an executable. I have only tested it in Raspberry, but as far as I know it´s the same for, say, Windows.
Of course anything can be reverse engineered, but hopefully it won´t be worth the effort to the regular end user.
I think it could be a solution as using a "container" to protect our code from the person we wouldn't let them access. the problem is docker is not a secure container. As the root of the host machine has the most powerful control of the Docker container, we don't have any method to protect the root from accessing inside of the container.
I just have some ideas about a secure container:
Build a container with init file like docker file, a password must be set when the container is created;
once the container is built, we have to use a password to access inside, including
reading\copy\modify files
all the files stored on the host machine should be encypt。
no "retrieve password" or “--skip-grant-” mode is offered. that means nobody can
access the data inside the container if u lost the password.
If we have a trustable container where we can run tomcat or Django server, code obfuscation will not be necessary.

python copying directory and reading text files Remotely

I'm about to start working on a project where a Python script is able to remote into a Windows Server and read a bunch of text files in a certain directory. I was planning on using a module called WMI as that is the only way I have been able to successfully remotely access a windows server using Python, But upon further research I'm not sure i am going to be using this module.
The only problem is that, these text files are constantly updating about every 2 seconds and I'm afraid that the script will crash if it comes into an MutEx error where it tries to open the file while it is being rewritten. The only thing I can think of is creating a new directory, copying all the files (via script) into this directory in the state that they are in and reading them from there; and just constantly overwriting these ones with the new ones once it finishes checking all of the old ones. Unfortunately I don't know how to execute this correctly, or efficiently.
How can I go about doing this? Which python module would be best for this execution?
There is Windows support in Ansible these days. It uses winrm. There are plenty of Python libraries that utilize winrm, just google it, but Ansible is very versatile.
http://docs.ansible.com/intro_windows.html
https://msdn.microsoft.com/en-us/library/aa384426%28v=vs.85%29.aspx
I've done some work with WMI before (though not from Python) and I would not try to use it for a project like this. As you said WMI tends to be obscure and my experience says such things are hard to support long-term.
I would either work at the Windows API level, or possibly design a service that performs the desired actions access this service as needed. Of course, you will need to install this service on each machine you need to control. Both approaches have merit. The WinAPI approach pretty much guarantees you don't invent any new security holes and is simpler initially. The service approach should make the application faster and required less network traffic. I am sure you can think of others easily.
You still have to have the necessary permissions, network ports, etc. regardless of the approach. E.g., WMI is usually blocked by firewalls and you still run as some NT process.
Sorry, not really an answer as such -- meant as a long comment.
ADDED
Re: API programming, though you have no Windows API experience, I expect you find it familiar for tasks such as you describe, i.e., reading and writing files, scanning directories are nothing unique to Windows. You only need to learn about the parts of the API that interest you.
Once you create the appropriate security contexts and start your client process, there is nothing service-oriented in the, i.e., your can simply open and close files, etc., ignoring that fact that the files are remote, other than server name being included in the UNC name of the file/folder location.

How to remotely update Python applications

What is the best method to push changes to a program written in Python? I have a piece of software that is written in Python that will regularly be updated. What would be the best way to do this? All the machines will have Windows 7.
Also, excuse the ambiguity of my question. This will be my first time having to implement an updating procedure. Feel free to mention specifics you would like me ot add.
If you're not already packaging your program with InnoSetup, I strongly recommend you switch to it, because it has facilities to make this sort of thing easier. You can specify any special situations, such as files that should not be updated if they already exist (i.e. if you have any internal configuration files or things like that), in the InnoSetup script.
Next, to allow the client machine to find out about new versions of your app, keep a very small file on your public web server that has the version number of the current release and the URL to the latest version's installer exe. For this file to be useful, whenever you release a newer version of your program you must update this file, as well as the version number in the InnoSetup script, and also some kind of APP_VERSION constant in your program.
Then, you'll need to handle these parts of the updater yourself:
Detecting when a newer version is available by retrieving the current-version file from your web server over HTTP, and comparing the version number there to the app's own APP_VERSION. Make sure to do this query in a way that fails gracefully if the client machine doesn't have Internet access, and that doesn't block the GUI while it is doing the request (in case there's a network issue that forces the query to wait a long while for a timeout).
If a newer version is available, asking the user if they want to update, and if they say yes downloading an updated installer to the TEMP directory. Depending on what GUI toolkit you are using, there are various mechanisms for displaying a progress dialog during the download; this is a good idea since the installer is likely to be at least an MB.
Closing your app, running a special update script in the background, then starting up the app again.
The update script will wait for the original process to die completely (easiest way to do this is to pass in the original process's PID as a command line argument and have the update script send a query signal 0 to that process every second or so until it goes away.) It can then run the installer silently in the background, perhaps while displaying a "Please Wait..." dialog to the user. Once the installer is done and reports success in its return code, the updater can restart your program.
Depending on how big your app is, this is more wasteful of bandwidth than the method using git or another SCM. Every update with this approach would involve downloading the entire installer for the latest version of the app, whereas an SCM would only download the files that have changed. However, it has the advantage that it requires no special server facilities except a regular web server, and no special installation of the SCM client on the user's computer.
Plus, InnoSetup is just generally cool. :-)
I would suggest using a source control program such as git or subversion. Also, if you are okay with everyone seeing the code, you can post the code on github, where anyone can pull from it. You could make it private, but you would have to pay for it and all the users would also have create a github account and set it up with their git install.
If you use a source control program, the other people will have to pull the edits manually by running a command, but you could make a script pr batch file that does this and have it run at start up or at regular intervals.
Just to be clear, if you want to do this, you will have to put the code on a server with and SSH support and set up git. If you don't want to go through all of the server set up, I would reccomend github.
git- http://git-scm.com/ (For windows version, go to downloads and select msysGit)
github - https://github.com/
For those of you that would be looking into something a little less dated, I was just looking at how to create python applications that can be updated remotely (though not limited to Windows like OP).
It seems like esky as been a solution for a while. Though it's been deprecated since 2018.
The latest and most up to date solution seem to be a combination of pyinstaller and pyupdater. Note that I don't have personal experience with it, I'm looking for a friend.
It seems to support windows, linux and Mac though and both python 2 and 3 so definitely worth having a look.
The basic principles of application updates are described well by DSimon's answer.
However, update security is a different matter altogether: You don't want your clients to end up installing malicious files.
PyUpdater, as suggested in jlengrand's answer, does provide some secure update functionality, but, unfortunately, PyUpdater 4.0 is broken and there has not been a new release in over half a year (now Aug 2022).
There's also python-tuf, which is the reference implementation of The Update Framework (TUF).
TUF (python-tuf) does everything humanly possible to ensure your update files are distributed securely. However, it does not handle application-specific things like checking for new application versions and installation on the client side.

Considerations for moving our web app to an appliance

We have developed a web application running on Linux that is quite popular. We now wish to release it as an appliance so customers can run it internally on their own networks.
We are unsure of the best approach. We are flexible on areas such as: the Linux distro, whether it's a hardware or software only appliance. Does anyone have any advice on the best way to go about this? Links to any good resources on the subject? Questions we should be asking ourselves? Legal considerations for a commercial app? Security considerations?
UPDATE:
It's a Python based web application. We would like the user to be able to do everything via a web interface. No command line stuff etc.
I know when Github needed to do something similar, they contracted with a company who specializes in building installers called BitRock.
If you want to develop the solution yourself, you can't go wrong building a Debian package (or RPM if you prefer). That's what most linux system admins would be comfortable with, and there are very well-known ways to give them a mix of customization/control while also making the process easy to manage on your end. That gives you and your users a very well-known update path as well.
Unless you have had very specific requests from customers, I would shy away from a turnkey-style appliance where you provide hardware. It's extra work for you and could be a turn-off to customers. Different business have different needs though, so maybe your client-base doesn't have IT and would prefer an all-in-one solution. Until you ask, you can't be sure though.
It depends on what langauge/tecnology application is written.
If it is java, release war file + tomcat/jboss.
If it is python, release eggs.
If it is php... not sure, probably just .tar.bz2.
Linux distro or virtual image mught be advantage, but I dislike using them, because they are usually does not fits to my infra (why do I have to install some custom debian-based distro to my rhel-only infrastructure?).

What are some successful methods for deploying a Django application on the desktop?

I have a Django application that I would like to deploy to the desktop. I have read a little on this and see that one way is to use freeze. I have used this with varying success in the past for Python applications, but am not convinced it is the best approach for a Django application.
My questions are: what are some successful methods you have used for deploying Django applications? Is there a de facto standard method? Have you hit any dead ends? I need a cross platform solution.
I did this a couple years ago for a Django app running as a local daemon. It was launched by Twisted and wrapped by py2app for Mac and py2exe for Windows. There was both a browser as well as an Air front-end hitting it. It worked pretty well for the most part but I didn't get to deploy it out in the wild because the larger project got postponed. It's been a while and I'm a bit rusty on the details, but here are a few tips:
IIRC, the most problematic thing was Python loading C extensions. I had an Intel assembler module written with C "asm" commands that I needed to load to get low-level system data. That took a while to get working across both platforms. If you can, try to avoid C extensions.
You'll definitely need an installer. Most likely the app will end up running in the background, so you'll need to mark it as a Windows service, Unix daemon, or Mac launchd application.
In your installer you'll want to provide a way to locate a free local TCP port. You may have to write a little stub routine that the installer runs or use the installer's built-in scripting facility to find a port that hasn't been taken and save it to a config file. You then load the config file inside your settings.py and whatever front-end you're going to deploy. That's the shared port. Or you could just pick a random number and hope no other service on the desktop steps on your toes :-)
If your front-end and back-end are separate apps then you'll need to design an API for them to talk to each other. Make sure you provide a flag to return the data in both raw and human-readable form. It really helps in debugging.
If you want Django to be able to send notifications to the user, you'll want to integrate with something like Growl or get Python for Windows extensions so you can bring up toaster pop-up notifications.
You'll probably want to stick with SQLite for database in which case you'll want to make sure you use semaphores to tackle multiple requests vying for the database (or any other shared resource). If your app is accessed via a browser users can have multiple windows open and hit the app at the same time. If using a custom front-end (native, Air, etc...) then you can control how many instances are running at a given time so it won't be as much of an issue.
You'll also want some sort of access to local system logging facilities since the app will be running in the background and make sure you trap all your exceptions and route it into the syslog. A big hassle was debugging Windows service startup issues. It would have been impossible without system logging.
Be careful about hardcoded paths if you want to stay cross-platform. You may have to rely on the installer to write a config file entry with the actual installation path which you'll have to load up at startup.
Test actual deployment especially across a variety of firewalls. Some of the desktop firewalls get pretty aggressive about blocking access to network services that accept incoming requests.
That's all I can think of. Hope it helps.
If you want a good solution, you should give up on making it cross platform. Your code should all be portable, but your deployment - almost by definition - needs to be platform-specific.
I would recommend using py2exe on Windows, py2app on MacOS X, and building deb packages for Ubuntu with a .desktop file in the right place in the package for an entry to show up in the user's menu. Unfortunately for the last option there's no convenient 'py2deb' or 'py2xdg', but it's pretty easy to make the relevant text file by hand.
And of course, I'd recommend bundling in Twisted as your web server for making the application easily self-contained :).

Categories