I have a long running process that will fetch 100k rows from the db genrate a web page and then release all the small objets (list, tuples and dicts). On windows, after each request the memory is freed. Howerver, on linux, the memory of the server keeps growing.
The following posts describes what the problem is and one possible solution.
http://pushingtheweb.com/2010/06/python-and-tcmalloc/
Is there any other way to get around this problem without having to compile my own python version which uses tcmalloc. That option is going to be very difficult to do, since python is controlled by the sys admin.
You may be able to compile Python in your own working directory rather than try to have the sysadmin replace the system Python.
First you should confirm that the tcmalloc solution solves your problem and does not impact performance too much for your application
Related
I'm about to start working on a project where a Python script is able to remote into a Windows Server and read a bunch of text files in a certain directory. I was planning on using a module called WMI as that is the only way I have been able to successfully remotely access a windows server using Python, But upon further research I'm not sure i am going to be using this module.
The only problem is that, these text files are constantly updating about every 2 seconds and I'm afraid that the script will crash if it comes into an MutEx error where it tries to open the file while it is being rewritten. The only thing I can think of is creating a new directory, copying all the files (via script) into this directory in the state that they are in and reading them from there; and just constantly overwriting these ones with the new ones once it finishes checking all of the old ones. Unfortunately I don't know how to execute this correctly, or efficiently.
How can I go about doing this? Which python module would be best for this execution?
There is Windows support in Ansible these days. It uses winrm. There are plenty of Python libraries that utilize winrm, just google it, but Ansible is very versatile.
http://docs.ansible.com/intro_windows.html
https://msdn.microsoft.com/en-us/library/aa384426%28v=vs.85%29.aspx
I've done some work with WMI before (though not from Python) and I would not try to use it for a project like this. As you said WMI tends to be obscure and my experience says such things are hard to support long-term.
I would either work at the Windows API level, or possibly design a service that performs the desired actions access this service as needed. Of course, you will need to install this service on each machine you need to control. Both approaches have merit. The WinAPI approach pretty much guarantees you don't invent any new security holes and is simpler initially. The service approach should make the application faster and required less network traffic. I am sure you can think of others easily.
You still have to have the necessary permissions, network ports, etc. regardless of the approach. E.g., WMI is usually blocked by firewalls and you still run as some NT process.
Sorry, not really an answer as such -- meant as a long comment.
ADDED
Re: API programming, though you have no Windows API experience, I expect you find it familiar for tasks such as you describe, i.e., reading and writing files, scanning directories are nothing unique to Windows. You only need to learn about the parts of the API that interest you.
Once you create the appropriate security contexts and start your client process, there is nothing service-oriented in the, i.e., your can simply open and close files, etc., ignoring that fact that the files are remote, other than server name being included in the UNC name of the file/folder location.
We all know situations when you cannot go open source and freely distribute software - and I am in one of these situations.
I have an app that consists of a number of binaries (compiled from C sources) and Python code that wraps it all into a system. This app used to work as a cloud solution so users had access to app functions via network but no chance to touch the actual server where binaries and code are stored.
Now we want to deliver the "local" version of our system. The app will be running on PCs that our users will physically own. We know that everything could be broken, but at least want to protect the app from possible copying and reverse-engineering as much as possible.
I know that Docker is a wonderful deployment tool so I wonder: is it possible to create encrypted Docker containers where no one can see any data stored in the container's filesystem? Is there a known solution to this problem?
Also, maybe there are well known solutions not based on Docker?
The root user on the host machine (where the docker daemon runs) has full access to all the processes running on the host. That means the person who controls the host machine can always get access to the RAM of the application as well as the file system. That makes it impossible to hide a key for decrypting the file system or protecting RAM from debugging.
Using obfuscation on a standard Linux box, you can make it harder to read the file system and RAM, but you can't make it impossible or the container cannot run.
If you can control the hardware running the operating system, then you might want to look at the Trusted Platform Module which starts system verification as soon as the system boots. You could then theoretically do things before the root user has access to the system to hide keys and strongly encrypt file systems. Even then, given physical access to the machine, a determined attacker can always get the decrypted data.
What you are asking about is called obfuscation. It has nothing to do with Docker and is a very language-specific problem; for data you can always do whatever mangling you want, but while you can hope to discourage the attacker it will never be secure. Even state-of-the-art encryption schemes can't help since the program (which you provide) has to contain the key.
C is usually hard enough to reverse engineer, for Python you can try pyobfuscate and similar.
For data, I found this question (keywords: encrypting files game).
If you want a completely secure solution, you're searching for the 'holy grail' of confidentiality: homomorphous encryption. In short, you want to encrypt your application and data, send them to a PC, and have this PC run them without its owner, OS, or anyone else being able to scoop at the data.
Doing so without a massive performance penalty is an active research project. There has been at least one project having managed this, but it still has limitations:
It's windows-only
The CPU has access to the key (ie, you have to trust Intel)
It's optimised for cloud scenarios. If you want to install this to multiple PCs, you need to provide the key in a secure way (ie just go there and type it yourself) to one of the PCs you're going to install your application, and this PC should be able to securely propagate the key to the other PCs.
Andy's suggestion on using the TPM has similar implications to points 2 and 3.
Sounds like Docker is not the right tool, because it was never intended to be used as a full-blown sandbox (at least based on what I've been reading). Why aren't you using a more full-blown VirtualBox approach? At least then you're able to lock up the virtual machine behind logins (as much as a physical installation on someone else's computer can be locked up) and run it isolated, encrypted filesystems and the whole nine yards.
You can either go lightweight and open, or fat and closed. I don't know that there's a "lightweight and closed" option.
I have exactly the same problem. Currently what I was able to discover is bellow.
A. Asylo(https://asylo.dev)
Asylo requires programs/algorithms to be written in C++.
Asylo library is integrated in docker and it seems to be feаsable to create custom dоcker image based on Asylo .
Asylo depends on many not so popular technologies like "proto buffers" and "bazel" etc. To me it seems that learning curve will be steep i.e. the person who is creating docker images/(programs) will need a lot of time to understand how to do it.
Asylo is free of charge
Asylo is bright new with all the advantages and disadvantages of being that.
Asylo is produced by Google but it is NOT an officially supported Google product according to the disclaimer on its page.
Asylo promises that data in trusted environment could be saved even from user with root privileges. However, there is lack of documentation and currently it is not clear how this could be implemented.
B. Scone(https://sconedocs.github.io)
It is binded to INTEL SGX technology but also there is Simulation mode(for development).
It is not free. It has just a small set of functionalities which are not paid.
Seems to support a lot of security functionalities.
Easy for use.
They seems to have more documentation and instructions how to build your own docker image with their technology.
For the Python part, you might consider using Pyinstaller, with appropriate options, it can pack your whole python app in a single executable file, which will not require python installation to be run by end users. It effectively runs a python interpreter on the packaged code, but it has a cipher option, which allows you to encrypt the bytecode.
Yes, the key will be somewhere around the executable, and a very savvy costumer might have the means to extract it, thus unraveling a not so readable code. It's up to you to know if your code contains some big secret you need to hide at all costs. I would probably not do it if I wanted to charge big money for any bug solving in the deployed product. I could use it if client has good compliance standards and is not a potential competitor, nor is expected to pay for more licenses.
While I've done this once, I honestly would avoid doing it again.
Regarding the C code, if you can compile it into executables and/or shared libraries can be included in the executable generated by Pyinstaller.
What is the best method to push changes to a program written in Python? I have a piece of software that is written in Python that will regularly be updated. What would be the best way to do this? All the machines will have Windows 7.
Also, excuse the ambiguity of my question. This will be my first time having to implement an updating procedure. Feel free to mention specifics you would like me ot add.
If you're not already packaging your program with InnoSetup, I strongly recommend you switch to it, because it has facilities to make this sort of thing easier. You can specify any special situations, such as files that should not be updated if they already exist (i.e. if you have any internal configuration files or things like that), in the InnoSetup script.
Next, to allow the client machine to find out about new versions of your app, keep a very small file on your public web server that has the version number of the current release and the URL to the latest version's installer exe. For this file to be useful, whenever you release a newer version of your program you must update this file, as well as the version number in the InnoSetup script, and also some kind of APP_VERSION constant in your program.
Then, you'll need to handle these parts of the updater yourself:
Detecting when a newer version is available by retrieving the current-version file from your web server over HTTP, and comparing the version number there to the app's own APP_VERSION. Make sure to do this query in a way that fails gracefully if the client machine doesn't have Internet access, and that doesn't block the GUI while it is doing the request (in case there's a network issue that forces the query to wait a long while for a timeout).
If a newer version is available, asking the user if they want to update, and if they say yes downloading an updated installer to the TEMP directory. Depending on what GUI toolkit you are using, there are various mechanisms for displaying a progress dialog during the download; this is a good idea since the installer is likely to be at least an MB.
Closing your app, running a special update script in the background, then starting up the app again.
The update script will wait for the original process to die completely (easiest way to do this is to pass in the original process's PID as a command line argument and have the update script send a query signal 0 to that process every second or so until it goes away.) It can then run the installer silently in the background, perhaps while displaying a "Please Wait..." dialog to the user. Once the installer is done and reports success in its return code, the updater can restart your program.
Depending on how big your app is, this is more wasteful of bandwidth than the method using git or another SCM. Every update with this approach would involve downloading the entire installer for the latest version of the app, whereas an SCM would only download the files that have changed. However, it has the advantage that it requires no special server facilities except a regular web server, and no special installation of the SCM client on the user's computer.
Plus, InnoSetup is just generally cool. :-)
I would suggest using a source control program such as git or subversion. Also, if you are okay with everyone seeing the code, you can post the code on github, where anyone can pull from it. You could make it private, but you would have to pay for it and all the users would also have create a github account and set it up with their git install.
If you use a source control program, the other people will have to pull the edits manually by running a command, but you could make a script pr batch file that does this and have it run at start up or at regular intervals.
Just to be clear, if you want to do this, you will have to put the code on a server with and SSH support and set up git. If you don't want to go through all of the server set up, I would reccomend github.
git- http://git-scm.com/ (For windows version, go to downloads and select msysGit)
github - https://github.com/
For those of you that would be looking into something a little less dated, I was just looking at how to create python applications that can be updated remotely (though not limited to Windows like OP).
It seems like esky as been a solution for a while. Though it's been deprecated since 2018.
The latest and most up to date solution seem to be a combination of pyinstaller and pyupdater. Note that I don't have personal experience with it, I'm looking for a friend.
It seems to support windows, linux and Mac though and both python 2 and 3 so definitely worth having a look.
The basic principles of application updates are described well by DSimon's answer.
However, update security is a different matter altogether: You don't want your clients to end up installing malicious files.
PyUpdater, as suggested in jlengrand's answer, does provide some secure update functionality, but, unfortunately, PyUpdater 4.0 is broken and there has not been a new release in over half a year (now Aug 2022).
There's also python-tuf, which is the reference implementation of The Update Framework (TUF).
TUF (python-tuf) does everything humanly possible to ensure your update files are distributed securely. However, it does not handle application-specific things like checking for new application versions and installation on the client side.
I'm working on an embedded device that runs Linux on ARM7 with 64MB RAM and 64MB storage (12MB free). The device should be configured via web therefore it needs to run an embedded web server. Currently it's using Lighttpd and LUA, but I'm thinking about replacing LUA (or maybe even Lighttpd) with Python. The server will occasionally be accessed by one or two users for making changes to internal settings of the C program that is running in Linux. So the server load isn't really a lot. I also need it to be Open Source Software. Web.py seems to be small enough but I still need to compile Python which I haven't done before. So I'm wondering what are the system requirements of Python? LUA seems to do quite well for small embedded systems but I don't like its syntax for C-binding.
However, I couldn't find updated information about system requirements for embedding Python in such settings. This page from Michael Lauer seems to be old.
Any ideas? Suggestions? hints? links?
I'm working on this device using OpenWRT + Python:
http://wiki.openwrt.org/oldwiki/OpenWrtDocs/Hardware/Meraki/Mini
The first python run is veeeeeery slow but it are metacompiling all .pyc files, next it work well.
I have a Django application that I would like to deploy to the desktop. I have read a little on this and see that one way is to use freeze. I have used this with varying success in the past for Python applications, but am not convinced it is the best approach for a Django application.
My questions are: what are some successful methods you have used for deploying Django applications? Is there a de facto standard method? Have you hit any dead ends? I need a cross platform solution.
I did this a couple years ago for a Django app running as a local daemon. It was launched by Twisted and wrapped by py2app for Mac and py2exe for Windows. There was both a browser as well as an Air front-end hitting it. It worked pretty well for the most part but I didn't get to deploy it out in the wild because the larger project got postponed. It's been a while and I'm a bit rusty on the details, but here are a few tips:
IIRC, the most problematic thing was Python loading C extensions. I had an Intel assembler module written with C "asm" commands that I needed to load to get low-level system data. That took a while to get working across both platforms. If you can, try to avoid C extensions.
You'll definitely need an installer. Most likely the app will end up running in the background, so you'll need to mark it as a Windows service, Unix daemon, or Mac launchd application.
In your installer you'll want to provide a way to locate a free local TCP port. You may have to write a little stub routine that the installer runs or use the installer's built-in scripting facility to find a port that hasn't been taken and save it to a config file. You then load the config file inside your settings.py and whatever front-end you're going to deploy. That's the shared port. Or you could just pick a random number and hope no other service on the desktop steps on your toes :-)
If your front-end and back-end are separate apps then you'll need to design an API for them to talk to each other. Make sure you provide a flag to return the data in both raw and human-readable form. It really helps in debugging.
If you want Django to be able to send notifications to the user, you'll want to integrate with something like Growl or get Python for Windows extensions so you can bring up toaster pop-up notifications.
You'll probably want to stick with SQLite for database in which case you'll want to make sure you use semaphores to tackle multiple requests vying for the database (or any other shared resource). If your app is accessed via a browser users can have multiple windows open and hit the app at the same time. If using a custom front-end (native, Air, etc...) then you can control how many instances are running at a given time so it won't be as much of an issue.
You'll also want some sort of access to local system logging facilities since the app will be running in the background and make sure you trap all your exceptions and route it into the syslog. A big hassle was debugging Windows service startup issues. It would have been impossible without system logging.
Be careful about hardcoded paths if you want to stay cross-platform. You may have to rely on the installer to write a config file entry with the actual installation path which you'll have to load up at startup.
Test actual deployment especially across a variety of firewalls. Some of the desktop firewalls get pretty aggressive about blocking access to network services that accept incoming requests.
That's all I can think of. Hope it helps.
If you want a good solution, you should give up on making it cross platform. Your code should all be portable, but your deployment - almost by definition - needs to be platform-specific.
I would recommend using py2exe on Windows, py2app on MacOS X, and building deb packages for Ubuntu with a .desktop file in the right place in the package for an entry to show up in the user's menu. Unfortunately for the last option there's no convenient 'py2deb' or 'py2xdg', but it's pretty easy to make the relevant text file by hand.
And of course, I'd recommend bundling in Twisted as your web server for making the application easily self-contained :).