There has been some back and forth between myself and the IT department of a company I recently began working at regarding the installation of the Python / Anaconda suite on my work PC. The IT department is making claims of security risks (with Anaconda), but I suspect it’s more a matter of them not wanting to give me access. My suspicion is based not on my IT knowledge but on the fact that I used Anaconda at my last job with no issues. I’m hoping for some insight into the enterprise risks (if any) associated with installing Anaconda. To summarize the situation/my knowledge:
• I am not a developer, nor do I come from an IT/enterprise risk background. I’ve used Python for analytics, data cleansing and report automation
• Current and past companies are within the finance industry, i.e. confidential information lives on the network
• I’m requesting Anaconda as opposed to Anaconda Enterprise
• I’m requesting Python version 3.6.4
I’m not trying to write-off IT’s concerns. What I’m trying to do is better understand the situation, educate myself and either alleviate their concerns or propose an alternative all parties can work with.
So my questions are:
Are there security threats associated with leveraging Anaconda? If so, what specifically?
If the risk is too great, what are alternatives to simply installing the Anaconda Suite?
Thanks
Are there security threats associated with leveraging Anaconda? If so, what specifically?
It depends on the environment. Do you have admin privileges? What does the global GPO policy look like? For example, without admin rights you can't do things like open raw sockets or access the network stack at the OS level, and vice versa. The thing is, you can do similar damage with CMD/PowerShell too. Is the former more secure than the latter? I don't think so.
If the risk is too great, what are alternatives to simply installing the Anaconda Suite?
It depends on what kind of functionality you need from Anaconda. Maybe you can use a different Python interpreter/framework, but from a security perspective it looks the same.
Usually IT staff don't have a clue how to implement OS/domain/security properly, so their solution is to tell everybody that it's a security risk. These days, everything is a security risk.
One could argue that any programming environment is a potential security risk.
But such an argument is rather unconvincing in any office environment where you will probably find a well-known office suite installed that comes with its own built-in programming environment! You can cause plenty of mayhem with VBA macros hidden in Word and Excel documents, especially when these documents are sent to unsuspecting co-workers.
To put it more bluntly, any environment that is not nailed down to the point of being unusable is a potential security risk. Can you e.g. run programs off a USB drive? Or open Office files from them? Or even copy files to them?
You could ask IT what their specific objections are.
BTW: You don't need administrator privileges to install Anaconda for yourself, only if you want to install it for all users on your machine.
Edit: Another approach is to bring your own laptop and use Python on that.
This is an approach I'm using. And I would suggest using a laptop loaded with a UNIX-like operating system like a Linux distribution or one of the BSD variants. All of these come with a lot of tools out of the box and since they have decent package management (as opposed to ms-windows) basically every open-source tool you can think of is easily available. There is a learning curve associated with this, but on these systems tools are meant to work together instead of existing as pre-packaged one-trick ponies.
For example, I keep yearly logbooks of work-related stuff that should be documented. These logbooks can run into 200-300 pages, with hundreds of illustrations and graphs. For such a thing, ms-word just doesn't cut it; I've tried several times. So I use LaTeX, python, gnuplot and a host of other tools for it. And the whole thing is kept under revision control as a matter of course.
This laptop isn't connected to the network at work, so it cannot be a threat to that.
We all know situations when you cannot go open source and freely distribute software - and I am in one of these situations.
I have an app that consists of a number of binaries (compiled from C sources) and Python code that wraps it all into a system. This app used to work as a cloud solution so users had access to app functions via network but no chance to touch the actual server where binaries and code are stored.
Now we want to deliver the "local" version of our system. The app will be running on PCs that our users will physically own. We know that everything could be broken, but at least want to protect the app from possible copying and reverse-engineering as much as possible.
I know that Docker is a wonderful deployment tool so I wonder: is it possible to create encrypted Docker containers where no one can see any data stored in the container's filesystem? Is there a known solution to this problem?
Also, maybe there are well known solutions not based on Docker?
The root user on the host machine (where the docker daemon runs) has full access to all the processes running on the host. That means the person who controls the host machine can always get access to the RAM of the application as well as the file system. That makes it impossible to hide a key for decrypting the file system or protecting RAM from debugging.
Using obfuscation on a standard Linux box, you can make it harder to read the file system and RAM, but you can't make it impossible, otherwise the container could not run.
If you can control the hardware running the operating system, then you might want to look at the Trusted Platform Module which starts system verification as soon as the system boots. You could then theoretically do things before the root user has access to the system to hide keys and strongly encrypt file systems. Even then, given physical access to the machine, a determined attacker can always get the decrypted data.
What you are asking about is called obfuscation. It has nothing to do with Docker and is a very language-specific problem; for data you can always do whatever mangling you want, but while you can hope to discourage the attacker it will never be secure. Even state-of-the-art encryption schemes can't help since the program (which you provide) has to contain the key.
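To make the "program has to contain the key" point concrete, here is a minimal sketch. The XOR "cipher" and the key name are purely illustrative assumptions (a toy, not a real scheme), but the weakness is the same for any cipher whose key ships alongside the program:

```python
import base64
from itertools import cycle

# Hypothetical key baked into the shipped program (illustrative only).
EMBEDDED_KEY = b"not-really-secret"


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key (toy cipher, NOT secure)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def protect(plaintext: str) -> str:
    """What the vendor does before shipping: obfuscate the data."""
    raw = xor_bytes(plaintext.encode("utf-8"), EMBEDDED_KEY)
    return base64.b64encode(raw).decode("ascii")


def program_reads(blob: str) -> str:
    """What the shipped program does at run time, using the key it carries."""
    return xor_bytes(base64.b64decode(blob), EMBEDDED_KEY).decode("utf-8")


original = "proprietary model parameters"
shipped_blob = protect(original)

# The user owns the machine, so they can read EMBEDDED_KEY out of the program
# and simply repeat the program's own decoding step.
recovered_by_user = program_reads(shipped_blob)
print(recovered_by_user)
```

Swapping the toy XOR for a strong cipher changes how hard the key is to find, not whether it can be found.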
C is usually hard enough to reverse engineer; for Python you can try pyobfuscate and similar tools.
For data, I found this question (keywords: encrypting files game).
If you want a completely secure solution, you're searching for the 'holy grail' of confidentiality: homomorphic encryption. In short, you want to encrypt your application and data, send them to a PC, and have this PC run them without its owner, OS, or anyone else being able to snoop on the data.
Doing so without a massive performance penalty is an active research area. At least one project has managed this, but it still has limitations:
1. It's Windows-only
2. The CPU has access to the key (i.e., you have to trust Intel)
3. It's optimised for cloud scenarios. If you want to install this on multiple PCs, you need to provide the key in a secure way (i.e. just go there and type it yourself) to one of the PCs where you're going to install your application, and this PC should be able to securely propagate the key to the other PCs.
Andy's suggestion on using the TPM has similar implications to points 2 and 3.
Sounds like Docker is not the right tool, because it was never intended to be used as a full-blown sandbox (at least based on what I've been reading). Why aren't you using a more full-blown VirtualBox approach? At least then you're able to lock up the virtual machine behind logins (as much as a physical installation on someone else's computer can be locked up) and run it isolated, encrypted filesystems and the whole nine yards.
You can either go lightweight and open, or fat and closed. I don't know that there's a "lightweight and closed" option.
I have exactly the same problem. What I have been able to discover so far is below.
A. Asylo(https://asylo.dev)
Asylo requires programs/algorithms to be written in C++.
The Asylo library is integrated with Docker, and it seems feasible to create a custom Docker image based on Asylo.
Asylo depends on several less widely known technologies such as Protocol Buffers and Bazel. To me it seems that the learning curve will be steep, i.e. the person creating the Docker images/programs will need a lot of time to understand how to do it.
Asylo is free of charge
Asylo is brand new, with all the advantages and disadvantages that entails.
Asylo is produced by Google but it is NOT an officially supported Google product according to the disclaimer on its page.
Asylo promises that data in the trusted environment can be protected even from a user with root privileges. However, documentation is lacking and it is currently not clear how this could be implemented.
B. Scone(https://sconedocs.github.io)
It is bound to Intel SGX technology, but there is also a simulation mode (for development).
It is not free. Only a small set of its functionality is available without payment.
Seems to support a lot of security functionality.
Easy to use.
They seem to have more documentation and instructions on how to build your own Docker image with their technology.
For the Python part, you might consider using PyInstaller. With appropriate options, it can pack your whole Python app into a single executable file, which does not require a Python installation to be run by end users. It effectively runs a Python interpreter on the packaged code, but it has a cipher option, which allows you to encrypt the bytecode.
Yes, the key will be somewhere around the executable, and a very savvy customer might have the means to extract it, thus unraveling some not-so-readable code. It's up to you to know if your code contains some big secret you need to hide at all costs. I would probably not do it if I wanted to charge big money for any bug fixing in the deployed product. I could use it if the client has good compliance standards and is not a potential competitor, nor is expected to pay for more licenses.
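To give a feel for what "not so readable code" means once the bytecode has been recovered, here is a small sketch using only the standard library. It does not involve PyInstaller at all; the function and the license string are made up for illustration. It only shows that a serialised code object still carries its constants and can be disassembled:

```python
import dis
import io
import marshal


def license_check(user_key):
    # Hypothetical "secret" logic that a vendor might want to hide.
    return user_key == "ACME-1234"


# Serialise the compiled code object, roughly what ends up inside a .pyc file.
packed = marshal.dumps(license_check.__code__)

# Someone holding the bytes can load them back and inspect them.
recovered = marshal.loads(packed)
listing = io.StringIO()
dis.dis(recovered, file=listing)

print(recovered.co_consts)   # constants used by the function
print(listing.getvalue())    # human-readable bytecode listing
```

So encryption of the bytecode mostly adds one more step (finding the key) before this kind of inspection becomes possible.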
While I've done this once, I honestly would avoid doing it again.
Regarding the C code, if you can compile it into executables and/or shared libraries, those can be included in the executable generated by PyInstaller.
Which framework has the most mature, flexible, integrated, centralized and easy-to-use plugin/extension system?
My main requirements are:
a centralized system/repository where I could find an extension I need
no need to make changes in the source code, the plugin should be easily enabled and disabled
large plugin/extension database
something like http://wordpress.org/extend/plugins/
http://www.symfony-project.org/plugins/
I can't speak for Django, but I can tell you about Rails' open source community. GitHub is the central location for all Rails open source code.
Most ruby libraries/plugins these days are packaged as "gems", which are easy to install, update, and remove. RubyGems is the place to go for these pre-packaged gems, when you care less about the code and more about dropping the functionality into your application.
There is now a new tool called RVM that keeps the gems (and even rails version) isolated from one application to the next, on your system. That way if one app uses version 1.0 of a gem, and another uses version 2.0, they don't conflict with each other.
All in all, a pretty sweet setup.
There are lots of reusable django apps around. You can find many on the CheeseShop, but even more on GitHub and BitBucket.
There is also django-packages, which is a bit like the CheeseShop, but just for django packages.
VirtualEnv is like RVM (or rather, RVM is like VirtualEnv), which is a great way to isolate your python packages (I even use it in production). It has been around for ages, and works well with pip (the best python package installer).
Both of them are mature frameworks. I don't use ruby so I don't know about the rails plugin land. Given how popular it is (and my information from my lurking time on local Ruby lists), it's pretty good.
With Django, you have (like Matthew mentioned) django-packages and a few other places. I've been working on a largish Django project and it's pretty easy to just search for something like "django facebook" on Google and get what you need. The Pinax project is an integrated collection of Django apps that lets you have most things out of the box. That's another thing you might want to consider. The plugins are packaged using the standard Python distutils libraries, so installation is a single command (or, if you're using pip/virtualenv, directly off the net).
VirtualEnv and related tools are not really Django specific. They're good practice if you're doing any python development though.
You should take a step back and evaluate both languages as well in my opinion. Python and Ruby are quite different in their approach to good code and it's likely that one will fit your brain better than the other.
I currently work with .NET exclusively and would like to have a go at Python. To this end I need to set up a Python development environment. A guide to this would be handy. I guess I would be doing web development, so I will need a web server and probably a database. I also need pointers to popular ORMs, an MVC framework, and a testing library.
One of my main criteria with all this is that I want to understand how it works, and I want it to be as isolated as possible. This is important as I am wary of polluting what is a working .NET environment with third-party web and database servers. I am perfectly happy using SQLite to start with if this is possible.
If I get on well with this I am also likely to want to set up an automated build and CI server (on a virtual machine, probably Ubuntu). Any suggestions for these would be useful.
My ultimate aim, if I like Python, is to have similar sorts of tools to those I have available with .NET and to really understand the build and deployment of it all. To start with I will settle for a simple development environment that is as isolated as possible and will be easy to remove if I don't like it. I don't want to use IronPython as I want the full experience of developing a Python solution using the tools and frameworks that are generally used.
It's not that hard to set up a Python environment, and I've never had it muck up my .NET work. Basically, install Python --- I'd use 2.6 rather than 3.0, which is not yet broadly accepted --- and add it to your PATH, and you're ready to go with the language. I wouldn't recommend using a Ubuntu VM as your development environment; if you're working on Windows, you might as well develop on Windows, and I've had no significant problems doing so. I go back and forth from Windows to Linux with no trouble.
If you have an editor that you're comfortable with that has basic support for Python, I'd stick with it. If not, I've found Geany to be a nice, light, easy-to-use editor with good Python support, though I use Emacs myself because I know it; other people like SCITE, NotePad++, or any of a slew of others. I'd avoid fancy IDEs for Python, because they don't match the character of the language, and I wouldn't bother with IDLE (included with Python), because it's a royal pain to use.
Suggestions for libraries and frameworks:
Django is the standard web framework, but it's big and you have to work django's way; I prefer CherryPy, which is also actively supported, but is light, gives you great freedom, and contains a nice, solid webserver that can be replaced easily with httpd.
Django includes its own ORM, which is nice enough; there's a standalone one for Python, though, which is even nicer: SQLAlchemy.
As far as a testing library goes, pyunit seems to me to be the obvious choice
Good luck, and welcome to a really fun language!
EDIT summary: I originally recommended Karrigell, but can't any more: since the 3.0 release, it's been continuously broken, and the community is not large enough to solve the problems. CherryPy is a good substitute if you like a light, simple framework that doesn't get in your way, so I've changed the above to suggest it instead.
Well, if you're thinking of setting up an Ubuntu VM anyway, you might as well make that your development environment. Then you can install Apache and MySQL or Postgres on that VM just via the standard packaging tools (apt-get install), and there's no danger of polluting your Windows environment.
You can either do the actual development on your Windows machine via your favourite IDE, using the VM as a networked drive and saving the code there, or you can just use the VM as a full desktop environment and do everything there, which is what I would recommend.
Install the pre-configured ActivePython release from activestate.
Among other features, it includes the PythonWin IDE (Windows only) which makes it easy to explore Python interactively.
The recommended reference is Dive Into Python, mentioned many times on similar SO discussions.
You should install python 2.4, python 2.5, python 2.6 and python 3.0, and add to your path the one you use more often (Add c:\Pythonxx\ and c:\Pythonxx\Scripts).
For every python 2.x, install easy_install; Download ez_setup.py and then from the cmd:
c:\Python2x\python.exe x:\path\to\ez_setup.py
c:\Python2x\Scripts\easy_install virtualenv
Then each time you start a new project, create a new virtual environment to isolate the specific packages you need for your project:
mkdir <project name>
cd <project name>
c:\Python2x\Scripts\virtualenv --no-site-packages .\v
It creates a copy of Python and its libraries in .\v\Scripts and .\v\Lib. Every third-party package you install in that environment will be put into .\v\Lib\site-packages. The --no-site-packages option doesn't give access to the global site-packages, so you can be sure all your dependencies are in .\v\Lib\site-packages.
To activate the virtual environment:
.\v\Scripts\activate
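For comparison, on current Python 3 the standard library ships a venv module that covers the same ground as the virtualenv commands above. This is a sketch under that assumption (it is not what the commands above run), and the project directory is a throwaway temporary folder created just for the demo:

```python
import tempfile
import venv
from pathlib import Path


def create_isolated_env(project_dir: Path) -> Path:
    """Create a virtual environment in <project_dir>/v and return its path."""
    env_dir = project_dir / "v"
    # system_site_packages=False mirrors the idea behind --no-site-packages;
    # with_pip=False keeps the sketch fast and offline.
    builder = venv.EnvBuilder(system_site_packages=False, with_pip=False)
    builder.create(env_dir)
    return env_dir


# Hypothetical project location, created in a temporary directory for the demo.
project = Path(tempfile.mkdtemp(prefix="demo_project_"))
env_path = create_isolated_env(project)
print(sorted(p.name for p in env_path.iterdir()))
```

From a shell the equivalent is python -m venv v, followed by the same activate script as above.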
For the frameworks, there are many. Django is great and very well documented, but you should probably look at Pylons first for its documentation on Unicode, packaging, deployment and testing, and for its better WSGI support.
For the IDE, Python comes with IDLE, which is enough for learning; however, you might want to look at Eclipse+PyDev, Komodo or Wingware Python IDE. NetBeans 6.5 has beta support for Python that looks promising (see top 5 Python IDEs).
For the webserver, you don't need any; Python has its own and all web frameworks come with their own. You might want to install MySQL or PostgreSQL; it's often better to develop on the same DB you will use for production.
Also, when you have learnt Python, look at Foundations of Agile Python Development or Expert Python Programming.
Using Python on Windows
SO: Python tutorial for total beginners?
Take a look at Pylons, read about WSGI and Paste.
There's nice introductory Google tech talk about them: ReUsable Web Components with Python and Future Python Web Development.
Here's my answer to similar question:
Django vs other Python web frameworks?
NOTE: I included a lot of links to frameworks, projects and what-not, but as a new user I was limited to 1 link per answer. If someone else with enough reputation to edit wants/can edit them into this answer instead of the footnotes, I'd be grateful.
There are some Python IDEs such as Wing IDE[1], and I believe some people use Eclipse[2] with a Python plugin[3] as well. A lot of people in the #python channel on FreeNode seem to prefer vim, emacs, nano and similar text editors over IDEs. My personal preferred editor is Vim, but if you've mostly done .NET development on Windows, presumably with the usual Visual X IDEs, vim and emacs will probably cause you culture shock and you'd be better off using an IDE.
Nearly all Python web frameworks* support the WSGI standard[4]. Most of the large web servers have some sort of plugin to support WSGI; the others support WSGI via FastCGI or plain CGI.
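For anyone wondering what WSGI actually looks like: an application is just a callable that takes the request environment and a start_response function. Below is a minimal sketch using only the standard library's wsgiref; the app name and the /demo path are made up for illustration.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app directly, without a server, and capture its response."""
    environ = {}
    setup_testing_defaults(environ)   # fills in a plausible request environment
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(hello_app, "/demo")
print(status, body)
```

To serve it for real during development, wsgiref.simple_server.make_server("127.0.0.1", 8000, hello_app).serve_forever() should do (the port is an arbitrary choice); the same callable can later be handed to mod_wsgi or another WSGI server unchanged.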
The Zope[5] and Django[6] frameworks have their own ORMs; of the other ORMs, the two most well known appear to be SQL Alchemy[7] and SQL Object[8]. I only have experience with the former, but both support all possible sane database choices, including SQLite, which is installed together with Python and hence perfectly suited to testing and experimenting without polluting your .NET environment with third-party web servers and database servers.
The builtin unittest[9] and pyunit[10] frameworks seem to be the preferred solutions for unit testing, but I don't have much experience with these.
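For a feel of the builtin unittest module, here is a minimal sketch; it will look familiar if you know NUnit. The slugify function is a hypothetical function under test, invented only for this example.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lower-case a title and join words with '-'."""
    return "-".join(title.lower().split())


class SlugifyTests(unittest.TestCase):
    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello Python World"), "hello-python-world")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  spaced   out  "), "spaced-out")


# Run the tests programmatically; `python -m unittest` does the same from a shell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```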
bpython[11] and ipython[12] offer enhanced interactive Python shells, which can greatly help speed up testing small bits of code and are hence worth looking into.
As for a list of well known and often used web frameworks, look into the following frameworks**:
Twisted[13] is a generic networking framework, which supports almost every single protocol under the sun.
Pylons[14] is light-weight framework aimed at being as flexible as possible and leaving all the choices about what ORM, templating language and what-not to you.
CherryPy[15] tries to provide an interface to expose Python objects to the web.
Django[6] attempts to be an all-in-one solution, with a built-in template system, ORM, admin pages and internationalization, whereas the previous frameworks involve more DIY work wiring together various components.
Zope[5] is aimed at large enterprise applications. I've heard nothing but good things about it, but the consensus seems to be that for smaller projects you're probably better off with one of the simpler and smaller frameworks.
TurboGears[16] is the framework I know the least about, but it seems to be mostly competition for Django.
This is everything I can think of right now, I'll edit and add stuff if I can think of it. I hope this helps you some in the wonderful world of python.
* - The main exception would be Apache's mod_python, which you should avoid for exactly that reason, use mod_wsgi instead.
** - Word of warning, I have not personally used these frameworks this is just a very short impression I have gotten from talking to other people about each framework, it may be wildly inaccurate. (If anyone has any corrections, do comment and I'll try to edit and fix this answer).
(The http:// is missing since they're recognized as links otherwise)
[1] www.wingware.com/
[2] www.eclipse.org/
[3] pydev.sourceforge.net/
[4] wsgi.org/wsgi/
[5] www.zope.org/
[6] www.djangoproject.com/
[7] www.sqlalchemy.org/
[8] www.sqlobject.org/
[9] docs.python.org/library/unittest.html
[10] pyunit.sourceforge.net/pyunit.html
[11] www.bpython-interpreter.org/
[12] ipython.scipy.org/
[13] twistedmatrix.com/trac/
[14] pylonshq.com/
[15] www.cherrypy.org/
[16] turbogears.org/
Environment?
Here is the simplest solution:
Install ActivePython 2.6. It's Python itself, but comes with some handy extras, like the Dive Into Python CHM.
Use Komodo Edit 5. It is among the good free editors you can use for Python.
Use IDLE. It's the simplest editor for short snippets, with syntax highlighting and auto-complete unmatched by most other IDEs. It comes bundled with Python.
Use IPython. It's a shell that does syntax highlighting and auto-complete, bash functions, pretty printing, logging, history and many such things.
Install easy_install and/or pip for installing various 3rd party apps easily.
Coming from Visual Studio and .NET it will feel a lot different, but it's an entirely different world.
For the framework, Django works the best. Walk through the tutorial and you will be impressed enough. The documentation rocks. The community, you have to see for yourself to know how wonderful it is!
Python has a built-in SQL database (SQLite) and web server, so you wouldn't need to install any third-party apps. Remember, Python comes with batteries included.
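As a quick illustration of the built-in database, here is a sketch using the standard library's sqlite3 module with an in-memory database. The table and column names are made up for the example.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# Parameter placeholders (?) keep values separate from the SQL text.
conn.executemany("INSERT INTO notes (body) VALUES (?)",
                 [("learn python",), ("try django",)])
conn.commit()

rows = conn.execute("SELECT id, body FROM notes ORDER BY id").fetchall()
print(rows)
conn.close()
```

Nothing gets installed system-wide, which fits the goal of keeping the existing .NET setup untouched.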
If you've worked with Eclipse before you could give Pydev a try
I'm wondering if there's such a thing as Django-like ease of web app development combined with good deployment, debugging and other tools?
Django is a very productive framework for building content-heavy sites; the best I've tried and a breath of fresh air compared to some of the Java monstrosities out there. However it's written in Python which means there's little real support in the way of deployment/packaging, debugging, profilers and other tools that make building and maintaining applications much easier.
Ruby has similar issues, and although I do like Ruby much better than I like Python, I get the impression that Rails is roughly in the same boat as Django when it comes to managing/supporting the app.
Has anyone here tried both Django and Grails (or other web frameworks) for non-trivial projects? How did they compare?
You asked for someone who has used both Grails and Django. I've done work on both for big projects. Here are my thoughts:
IDE's:
Django works really well in Eclipse, Grails works really well in IntelliJ Idea.
Debugging:
Practically the same (assuming you use IntelliJ for Grails, and Eclipse for Python). Step debugging, inspecting variables, etc... never need a print statement for either. Sometimes django error messages can be useless but Grails error messages are usually pretty lengthy and hard to parse through.
Time to run a unit test:
django: 2 seconds.
Grails: 20 seconds (the tests themselves both run in a fraction of a second, it's the part about loading the framework to run them that takes the rest... as you can see, Grails is frustratingly slow to load).
Deployment:
Django: copy & paste one file into an apache config, and to redeploy, just change the code and reload apache.
Grails: create a .war file, deploy it on tomcat, rinse and repeat to redeploy.
Programming languages:
Groovy is TOTALLY awesome. I love it, more so than Python. But I certainly have no complaints about Python either.
Plugins:
Grails: lots of broken plugins (and can use every java lib ever).
Django: a few stable plugins, but enough to do most of what you need.
Database:
Django: schema migrations using South, and generally intuitive relations.
Grails: no schema migrations, and by default it deletes the database on startup... WTF
Usage:
Django: startups (especially in the Gov 2.0 space), independent web dev shops.
Grails: enterprise
Hope that helps!
However it's written in Python which means there's little real support in the way of deployment/packaging, debugging, profilers and other tools that make building and maintaining applications much easier.
Python has:
a great interactive debugger, which makes very good use of Python REPL.
easy_install and virtualenv for dependency management, packaging and deployment.
profiling features comparable to other languages
So IMHO you shouldn't worry about these things; use Python and Django and live happily :-)
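To back up the profiling point, here is a small sketch using the standard library's cProfile and pstats. The slow_sum function is a throwaway example invented for the demo; in a real app you would wrap a view or a management command instead.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive function so there is something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
answer = slow_sum(200_000)
profiler.disable()

# Format the collected statistics, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The interactive debugger is just as close at hand: dropping import pdb; pdb.set_trace() into the code opens a prompt at that line.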
Lucky for you, newest version of Django runs on Jython, so you don't need to leave your whole Java ecosystem behind.
Speaking of frameworks, I evaluated this year:
Pylons (Python)
webpy (Python)
Symfony (PHP)
CakePHP (PHP)
None of these frameworks comes close to the power of Django or Ruby on Rails. Based on my colleague's opinion I could recommend the Kohana framework. The downside is that it's written in PHP and, as far as I know, PHP doesn't have superb tools for debugging, profiling and packaging of apps.
Edit: Here is a very good article about packaging and deployment of Python apps (specifically Django apps). It's a hot topic in Django community now.
The statement that Grails deletes the database on start-up is completely wrong. Its behavior on start-up is completely configurable and easy to configure. I generally use create-drop when running an app in dev mode. I use update when I run in test and production.
I also love the bootstrap processing that lets me pre-configure test users, data, etc by environment in Grails.
I'd love to see someone who has really built and deployed some commercial projects comment on the pros / cons. It would be a really interesting read.
Grails.
Grails looks just like Rails (Ruby), but it uses Groovy, which is simpler than Java. It uses Java technology and you can use any Java lib without any trouble.
I also chose Grails for its simplicity and because there are lots of Java libs (such as jasper report, jawr etc.), and I am glad that they have now joined SpringSource, which makes their base solid.
I have two friends who originally started writing an application using Ruby on Rails, but ran into a number of issues and limitations. After about 8 weeks of working on it, they decided to investigate other alternatives.
They settled on the Catalyst Framework, and Perl. That was about 4 months ago now, and they've repeatedly talked about how much better the application is going, and how much more flexibility they have.
With Perl, you have all of CPAN available to you, along with the large quantity of tools included. I'd suggest taking a look at it, at least.
The "good deployment" issue -- for Python -- doesn't have the Deep Significance that it has for Java.
Python deployment for Django is basically "move the files". You can run straight out of the subversion trunk directory if you want to.
You can, without breaking much of a sweat, use the Python distutils to build yourself a distribution kit that puts your Django apps into Python's site-packages. I'm not a big fan of it, but it's really easy to do.
Since my stuff runs in Linux, I have simple "install.py" scripts that move stuff out of the Subversion directories into /opt/this and /opt/that directories. I use explicit path settings in my Apache configuration to name those directories where the applications live.
Patching can be done by editing the files in place. (A bad policy.) I prefer to edit in the SVN location and rerun my little install to be sure I actually have all the files under control.
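A rough sketch of what such an "install.py" might look like is below. This is not my actual script: the directory names are placeholders, temporary directories stand in for the Subversion checkout and the /opt target so the demo is harmless, and dirs_exist_ok needs Python 3.8 or later.

```python
import shutil
import tempfile
from pathlib import Path


def install(source_dir: Path, target_dir: Path) -> list:
    """Copy a working-copy directory to its deployment location.

    Version-control metadata (.svn) and compiled files are left behind.
    """
    ignore = shutil.ignore_patterns(".svn", "*.pyc", "__pycache__")
    # dirs_exist_ok=True (Python 3.8+) lets the script be rerun after each edit.
    shutil.copytree(source_dir, target_dir, ignore=ignore, dirs_exist_ok=True)
    return sorted(str(p.relative_to(target_dir))
                  for p in target_dir.rglob("*") if p.is_file())


# Demo with temporary directories standing in for the checkout and /opt/this.
root = Path(tempfile.mkdtemp(prefix="deploy_demo_"))
checkout = root / "trunk" / "myapp"
(checkout / ".svn").mkdir(parents=True)
(checkout / ".svn" / "entries").write_text("metadata")
(checkout / "views.py").write_text("# application code\n")

installed = install(checkout, root / "opt" / "myapp")
print(installed)
```

Note that a plain copy like this never removes files that were deleted from the checkout, so an occasional clean reinstall is still worthwhile.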
cakephp.org
CakePHP is really good, really close to Ruby on Rails (1.2). It is in PHP, works very well on shared hosts and is easy to implement.
The only downside is that the documentation is somewhat lacking, but you quickly get it and quickly start doing cool stuff.
I totally recommend cakephp.
Personally I made some rather big projects with Django, but I can compare it only with said "monstrosities" (Spring, EJB) and really low-level stuff like Twisted.
Web frameworks using interpreted languages are mostly in their infancy, and all of them (the actively maintained ones, that is) are getting better every day.
By "good deployment" are you comparing it with Java's EAR files, which allow you to deploy web applications by uploading a single file to a J2EE server? (And, to a lesser extent, WAR files; EAR files can have WAR files for dependent projects)
I don't think Django or Rails have gotten quite to that point yet, but I could be wrong... zuber pointed out an article with more details on the Python side.
Capistrano may help out on the Ruby side.
Unfortunately, I haven't really worked with either Python or Ruby that much, so I can't help out on profilers or debuggers.