How to provide sensitive data to nose - python

I am using nose to test a Python class that requires a username and password. What is the best practice for providing that username and password pair to the test module? I'd like the testing process to be intuitive for another engineer to pick up, and I cannot store any sensitive information in cleartext. Any ideas?

The most common option is to provide the data in environment variables.
import os
password = os.environ['PASSWORD']
username = os.environ['USERNAME']
$ USERNAME=user PASSWORD=qwerty nosetests
Note: Environment variables can be read by root and process owner. Variables provided via command line appear in shell history with the command.
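For example, a nose-compatible test module might read the credentials once and skip its tests when they are not set. A minimal sketch, reusing the variable names from the example above (the test body itself is only a placeholder):
import os
import unittest

USERNAME = os.environ.get('USERNAME')
PASSWORD = os.environ.get('PASSWORD')

class CredentialTest(unittest.TestCase):
    @unittest.skipUnless(USERNAME and PASSWORD,
                         "set USERNAME and PASSWORD to run this test")
    def test_login(self):
        # Replace this with a real call to the class under test.
        self.assertTrue(USERNAME and PASSWORD)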
Manual tests
encourage (enforce) developers to obtain their own credentials,
encrypt credentials; among others I would take a look at git-secret. Tox could be used to automate this a bit.
Automated tests
Most CI systems offer an option (built in or via a plugin) to store credentials, pass them to test/build plans, and mask them in logs. For example, Jenkins has numerous plugins such as Mask Passwords and Credentials Binding. Atlassian's Bamboo by default masks any plan variable whose name contains the word "password". Personally, I found Travis' encrypted variables the most useful, since the credentials are tied to the repo/commit rather than to a build plan.

Related

Is there a way for me to hide API keys from users but still allow them to run a script? [duplicate]

I keep important settings like the hostnames and ports of development and production servers in my version control system. But I know that it's bad practice to keep secrets (like private keys and database passwords) in a VCS repository.
But passwords--like any other setting--seem like they should be versioned. So what is the proper way to keep passwords version controlled?
I imagine it would involve keeping the secrets in their own "secrets settings" file and having that file encrypted and version controlled. But what technologies? And how to do this properly? Is there a better way entirely to go about it?
I ask the question generally, but in my specific instance I would like to store secret keys and passwords for a Django/Python site using git and github.
Also, an ideal solution would do something magical when I push/pull with git--e.g., if the encrypted passwords file changes a script is run which asks for a password and decrypts it into place.
EDIT: For clarity, I am asking about where to store production secrets.
You're exactly right to want to encrypt your sensitive settings file while still maintaining the file in version control. As you mention, the best solution would be one in which Git will transparently encrypt certain sensitive files when you push them so that locally (i.e. on any machine which has your certificate) you can use the settings file, but Git or Dropbox or whoever is storing your files under VC does not have the ability to read the information in plaintext.
Tutorial on Transparent Encryption/Decryption during Push/Pull
This gist (https://gist.github.com/873637) is a tutorial on using Git's smudge/clean filter driver with openssl to transparently encrypt pushed files. You just need to do some initial setup.
Summary of How it Works
You'll basically be creating a .gitencrypt folder containing 3 bash scripts,
clean_filter_openssl
smudge_filter_openssl
diff_filter_openssl
which are used by Git for decryption, encryption, and supporting Git diff. A master passphrase and salt (fixed!) are defined inside these scripts, and you MUST ensure that .gitencrypt is never actually pushed.
Example clean_filter_openssl script:
#!/bin/bash
SALT_FIXED=<your-salt> # 24 or less hex characters
PASS_FIXED=<your-passphrase>
openssl enc -base64 -aes-256-ecb -S $SALT_FIXED -k $PASS_FIXED
The smudge_filter_openssl and diff_filter_openssl scripts are similar; see the Gist.
Your repo with sensitive information should have a .gitattributes file (unencrypted and included in the repo) which references the .gitencrypt directory (which contains everything Git needs to encrypt/decrypt the project transparently) and which is present on your local machine.
.gitattributes contents:
* filter=openssl diff=openssl
[merge]
renormalize = true
Finally, you will also need to add the following content to your .git/config file
[filter "openssl"]
smudge = ~/.gitencrypt/smudge_filter_openssl
clean = ~/.gitencrypt/clean_filter_openssl
[diff "openssl"]
textconv = ~/.gitencrypt/diff_filter_openssl
Now, when you push the repository containing your sensitive information to a remote repository, the files will be transparently encrypted. When you pull from a local machine which has the .gitencrypt directory (containing your passphrase), the files will be transparently decrypted.
Notes
I should note that this tutorial does not describe a way to only encrypt your sensitive settings file. This will transparently encrypt the entire repository that is pushed to the remote VC host and decrypt the entire repository so it is entirely decrypted locally. To achieve the behavior you want, you could place sensitive files for one or many projects in one sensitive_settings_repo. You could investigate how this transparent encryption technique works with Git submodules http://git-scm.com/book/en/Git-Tools-Submodules if you really need the sensitive files to be in the same repository.
The use of a fixed passphrase could theoretically lead to brute-force vulnerabilities if attackers had access to many encrypted repos/files. IMO, the probability of this is very low. As a note at the bottom of this tutorial mentions, not using a fixed passphrase will result in local versions of a repo on different machines always showing that changes have occurred with 'git status'.
Heroku pushes the use of environment variables for settings and secret keys:
The traditional approach for handling such config vars is to put them under source - in a properties file of some sort. This is an error-prone process, and is especially complicated for open source apps which often have to maintain separate (and private) branches with app-specific configurations.
A better solution is to use environment variables, and keep the keys out of the code. On a traditional host or working locally you can set environment vars in your bashrc. On Heroku, you use config vars.
With Foreman and .env files, Heroku provides an enviable toolchain to export, import and synchronise environment variables.
Personally, I believe it's wrong to save secret keys alongside code. It's fundamentally inconsistent with source control, because the keys are for services extrinsic to the code. The one boon would be that a developer can clone HEAD and run the application without any setup. However, suppose a developer checks out a historic revision of the code. Their copy will include last year's database password, so the application will fail against today's database.
With the Heroku method above, a developer can checkout last year's app, configure it with today's keys, and run it successfully against today's database.
The cleanest way in my opinion is to use environment variables. You won't have to deal with .dist files for example, and the project state on the production environment would be the same as your local machine's.
I recommend reading The Twelve-Factor App's config chapter, the others too if you're interested.
I suggest using configuration files for that and not versioning them.
You can, however, version examples of the files.
I don't see any problem with sharing development settings. By definition they should contain no valuable data.
An option would be to put project-bound credentials into an encrypted container (TrueCrypt or Keepass) and push it.
Update as answer from my comment below:
Interesting question btw. I just found this: github.com/shadowhand/git-encrypt which looks very promising for automatic encryption
Since asking this question I have settled on a solution, which I use when developing small applications with a small team of people.
git-crypt
git-crypt uses GPG to transparently encrypt files when their names match certain patterns. For instance, if you add to your .gitattributes file...
*.secret.* filter=git-crypt diff=git-crypt
...then a file like config.secret.json will always be pushed to remote repos with encryption, but remain unencrypted on your local file system.
If you want to add a new GPG key (a person) that can decrypt the protected files, run git-crypt add-gpg-user <gpg_user_key>. This creates a new commit, and the new user will be able to decrypt subsequent commits.
BlackBox was recently released by StackExchange and while I have yet to use it, it seems to exactly address the problems and support the features requested in this question.
From the description on https://github.com/StackExchange/blackbox:
Safely store secrets in a VCS repo (i.e. Git or Mercurial). These commands make it easy for you to GPG encrypt specific files in a repo so they are "encrypted at rest" in your repository. However, the scripts make it easy to decrypt them when you need to view or edit them, and decrypt them for use in production.
I ask the question generally, but in my specific instance I would like to store secret keys and passwords for a Django/Python site using git and github.
No, just don't, even if it's your private repo and you never intend to share it, don't.
You should create a local_settings.py, add it to your VCS ignore file, and in your settings.py do something like
from local_settings import DATABASES, SECRET_KEY
If your secret settings change that much, I'd venture to say you're doing something wrong.
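A common variant of this pattern (just a sketch, nothing Django itself requires) keeps safe defaults in settings.py and makes the local module optional:
# settings.py
DATABASES = {}                 # harmless defaults that are safe to commit
SECRET_KEY = 'dev-only-key'

try:
    from local_settings import *   # local_settings.py is listed in .gitignore
except ImportError:
    pass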
EDIT: I assume you want to keep track of your previous password versions - say, for a script that would prevent password reuse, etc.
I think GnuPG is the best way to go - it's already used in one git-related project (git-annex) to encrypt repository contents stored on cloud services. GnuPG (GNU Privacy Guard) provides very strong key-based encryption.
You keep a key on your local machine.
You add 'mypassword' to ignored files.
On pre-commit hook you encrypt the mypassword file into the mypassword.gpg file tracked by git and add it to the commit.
On post-merge hook you just decrypt mypassword.gpg into mypassword.
Now, the idea is that if your 'mypassword' file did not change, re-encrypting it should produce the same ciphertext, so nothing new is added to the index (no redundancy), while the slightest modification of mypassword results in a radically different ciphertext, so mypassword.gpg in the staging area differs from the one in the repository and is added to the commit. (Note that with GPG's default settings every encryption run uses a fresh random session key or salt, so identical plaintext does not actually produce identical ciphertext; in practice the hook may need to compare decrypted contents instead.) Even if an attacker gets hold of your GPG key, they still need to brute-force its passphrase. If the attacker gets access to the remote repository with the ciphertext, they can compare a handful of ciphertexts, but that is not enough to give them any non-negligible advantage.
Later on you can use .gitattributes to provide on-the-fly decryption for quick git diff output of your password file.
Also you can have separate keys for different types of passwords etc.
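A pre-commit hook along these lines can be a short script. This is only a sketch: the key id and file names are assumptions, gpg must be on the PATH, and the hook file (.git/hooks/pre-commit) must be executable:
#!/usr/bin/env python
# .git/hooks/pre-commit: encrypt the ignored secrets file and stage the ciphertext.
import subprocess

KEY_ID = 'you@example.com'   # hypothetical GPG key identity

subprocess.check_call([
    'gpg', '--batch', '--yes',
    '--recipient', KEY_ID,
    '--output', 'mypassword.gpg',
    '--encrypt', 'mypassword',
])
subprocess.check_call(['git', 'add', 'mypassword.gpg'])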
Usually, I separate passwords out into a config file and ship a .dist version of it:
/yourapp
    main.py
    default.cfg.dist
When I run main.py, it reads the real password from default.cfg, which is a copy of the .dist file with the real values filled in.
P.S. When you work with git or hg, you can ignore *.cfg files by adding them to .gitignore or .hgignore.
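Reading the copied config from main.py could then look like this (a sketch assuming a [credentials] section inside default.cfg; the section and key names are arbitrary):
# main.py
import configparser

config = configparser.ConfigParser()
config.read('default.cfg')          # the untracked copy of default.cfg.dist

username = config['credentials']['username']
password = config['credentials']['password']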
Provide a way to override the config
This is the best way to manage a set of sane defaults for the config you check in, without requiring the config to be complete or to contain things like hostnames and credentials. There are a few ways to override default configs.
Environment variables (as others have already mentioned) are one way of doing it.
The best way is to look for an external config file that overrides the default config values. This allows you to manage the external configs via a configuration management system like Chef, Puppet or Cfengine. Configuration management is the standard answer for the management of configs separate from the codebase so you don't have to do a release to update the config on a single host or a group of hosts.
FYI: Encrypting creds is not always a best practice, especially in a place with limited resources. It may be the case that encrypting creds will gain you no additional risk mitigation and simply add an unnecessary layer of complexity. Make sure you do the proper analysis before making a decision.
Encrypt the passwords file, using for example GPG. Add the keys on your local machine and on your server. Decrypt the file and put it outside your repo folders.
I use a passwords.conf located in my home folder. On every deploy this file gets updated.
No, private keys and passwords do not fall under revision control. There is no reason to burden everyone with read access to your repository with knowing sensitive service credentials used in production, when most likely not all of them should have access to those services.
Starting with Django 1.4, your Django projects now ship with a project.wsgi module that defines the application object and it's a perfect place to start enforcing the use of a project.local settings module that contains site-specific configurations.
This settings module is ignored by revision control, but its presence is required when running your project instance as a WSGI application, which is typical for production environments. This is what it should look like:
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "project.local")
# This application object is used by the development server
# as well as any WSGI server configured to use this file.
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
Now you can have a local.py module whose owner and group can be configured so that only authorized personnel and the Django processes can read the file's contents.
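The project/local.py module itself can then contain just the site-specific values. A hypothetical example, assuming settings.py defines DATABASES in the usual Django way:
# project/local.py  (not under revision control; owner/group readable only)
from project.settings import *      # start from the shared, committed settings

SECRET_KEY = 'generate-a-unique-key-for-this-host'
DATABASES['default']['PASSWORD'] = 'the-production-db-password'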
If you need VCS for your secrets, you should at least keep them in a second repository, separated from your actual code. That way you can give your team members access to the source code repository and they won't see your credentials. Furthermore, host this repository somewhere else (e.g. on your own server with an encrypted filesystem, not on GitHub), and for checking it out to the production system you could use something like a git submodule.
This is what I do:
Keep all secrets as env vars in $HOME/.secrets (go-r perms) that $HOME/.bashrc sources (this way if you open .bashrc in front of someone, they won't see the secrets)
Configuration files are stored in VCS as templates, such as config.properties stored as config.properties.tmpl
The template files contain a placeholder for the secret, such as:
my.password=##MY_PASSWORD##
On application deployment, a script is run that transforms the template file into the target file, replacing the placeholders with the values of environment variables, such as changing ##MY_PASSWORD## to the value of $MY_PASSWORD.
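That transformation step can be a few lines of Python (a sketch; the ##NAME## convention and file names follow the example above, and a missing variable will raise a KeyError on purpose):
# render_config.py: expand ##NAME## placeholders from environment variables.
import os
import re

def render(template_path, target_path):
    with open(template_path) as src:
        text = src.read()
    # Replace every ##NAME## with the value of the environment variable NAME.
    text = re.sub(r'##(\w+)##', lambda m: os.environ[m.group(1)], text)
    with open(target_path, 'w') as dst:
        dst.write(text)

render('config.properties.tmpl', 'config.properties')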
Another approach could be to avoid saving secrets in version control systems entirely and instead use a tool like Vault from HashiCorp, a secret store with key rolling and auditing, an API, and embedded encryption.
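As a rough illustration (not something the answer above prescribes), reading a secret with the hvac client library against a KV version 2 secrets engine mounted at secret/ could look like this; the URL, token handling and secret path are all assumptions:
import os
import hvac

client = hvac.Client(url='https://vault.example.com:8200',
                     token=os.environ['VAULT_TOKEN'])   # token injected at deploy time
secret = client.secrets.kv.v2.read_secret_version(path='myapp/db')
password = secret['data']['data']['password']           # KV v2 nests the payload twice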
You could use EncFS if your system provides it. That way you could keep your encrypted data as a subfolder of your repository, while providing your application a decrypted view of the data mounted alongside it. As the encryption is transparent, no special operations are needed on pull or push.
You would, however, need to mount the EncFS folders, which could be done by your application based on a password stored somewhere outside the versioned folders (e.g. in environment variables).

Learning Python fast - how can I protect some private connections from being exposed

Hi I'm new to the community and new to Python, experienced but rusty on other high level languages, so my question is simple.
I made a simple script to connect to a private ftp server, and retrieve daily information from it.
from ftplib import FTP

#Open ftp connection
#Connect to server to retrieve inventory
def FTPconnection(file_name):
    ftp = FTP('ftp.serveriuse.com')
    ftp.login('mylogin', 'password')
    #List the files in the current directory
    print("Current File List:")
    file = ftp.dir()
    print(file)
    # # #Get the latest csv file from server
    # ftp.cwd("/pub")
    gfile = open(file_name, "wb")
    ftp.retrbinary('RETR ' + file_name, gfile.write)
    gfile.close()
    ftp.quit()

FTPconnection('test1.csv')
FTPconnection('test2.csv')
That's the whole script, it passes my credentials, and then calls the function FTPconnection on two different files I'm retrieving.
Then my other script, the one that processes the files, has an import statement, since I tried to use this script as a module; all the import does is connect to the FTP server and fetch the information.
import ftpconnect as ftpc
This is in the other Python script, the one that does the processing.
It works but I want to improve it, so I need some guidance on best practices about how to do this, because in Spyder 4.1.5 I get a 'Module ftpconnect called but unused' warning ... so probably I am missing something here. I'm developing on macOS using Anaconda and Python 3.8.5.
I'm trying to build an app to automate some tasks, but I couldn't find anything about modules that guided me to better code; the documentation simply says you have to import whatever .py file name you used and that will be considered a module ...
And my final question: how do you normally protect private information (FTP credentials) from being exposed? This is not about protecting my code, only the credentials.
There are a few options for storing passwords and other secrets that a Python program needs to use, particularly a program that needs to run in the background where it can't just ask the user to type in the password.
Problems to avoid:
Checking the password in to source control where other developers or even the public can see it.
Other users on the same server reading the password from a configuration file or source code.
Having the password in a source file where others can see it over your shoulder while you are editing it.
Option 1: SSH
This isn't always an option, but it's probably the best. Your private key is never transmitted over the network, SSH just runs mathematical calculations to prove that you have the right key.
In order to make it work, you need the following:
The database or whatever you are accessing needs to be accessible by SSH. Try searching for "SSH" plus whatever service you are accessing. For example, "ssh postgresql". If this isn't a feature on your database, move on to the next option.
Create an account to run the service that will make calls to the database, and generate an SSH key.
Either add the public key to the service you're going to call, or create a local account on that server, and install the public key there.
Option 2: Environment Variables
This one is the simplest, so it might be a good place to start. It's described well in the Twelve Factor App. The basic idea is that your source code just pulls the password or other secrets from environment variables, and then you configure those environment variables on each system where you run the program. It might also be a nice touch if you use default values that will work for most developers. You have to balance that against making your software "secure by default".
Here's an example that pulls the server, user name, and password from environment variables.
import os
server = os.getenv('MY_APP_DB_SERVER', 'localhost')
user = os.getenv('MY_APP_DB_USER', 'myapp')
password = os.getenv('MY_APP_DB_PASSWORD', '')
db_connect(server, user, password)
Look up how to set environment variables in your operating system, and consider running the service under its own account. That way you don't have sensitive data in environment variables when you run programs in your own account. When you do set up those environment variables, take extra care that other users can't read them. Check file permissions, for example. Of course any users with root permission will be able to read them, but that can't be helped. If you're using systemd, look at the service unit, and be careful to use EnvironmentFile instead of Environment for any secrets. Environment values can be viewed by any user with systemctl show.
Option 3: Configuration Files
This is very similar to the environment variables, but you read the secrets from a text file. I still find the environment variables more flexible for things like deployment tools and continuous integration servers. If you decide to use a configuration file, Python supports several formats in the standard library, like JSON, INI, netrc, and XML. You can also find external packages like PyYAML and TOML. Personally, I find JSON and YAML the simplest to use, and YAML allows comments.
Three things to consider with configuration files:
Where is the file? Maybe a default location like ~/.my_app, and a command-line option to use a different location.
Make sure other users can't read the file.
Obviously, don't commit the configuration file to source code. You might want to commit a template that users can copy to their home directory.
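A minimal loader for such a file might look like this; the ~/.my_app path and key name are only examples, and the permission check is POSIX-specific:
import json
import os

CONFIG_PATH = os.path.expanduser('~/.my_app')     # e.g. copied from a committed template

# Warn if group/other permission bits are set on the secrets file.
if os.stat(CONFIG_PATH).st_mode & 0o077:
    print('warning: %s is readable by other users' % CONFIG_PATH)

with open(CONFIG_PATH) as f:
    config = json.load(f)

db_password = config['db_password']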
Option 4: Python Module
Some projects just put their secrets right into a Python module.
# settings.py
db_server = 'dbhost1'
db_user = 'my_app'
db_password = 'correcthorsebatterystaple'
Then import that module to get the values.
# my_app.py
from settings import db_server, db_user, db_password
db_connect(db_server, db_user, db_password)
One project that uses this technique is Django. Obviously, you shouldn't commit settings.py to source control, although you might want to commit a file called settings_template.py that users can copy and modify.
I see a few problems with this technique:
Developers might accidentally commit the file to source control. Adding it to .gitignore reduces that risk.
Some of your code is not under source control. If you're disciplined and only put strings and numbers in here, that won't be a problem. If you start writing logging filter classes in here, stop!
If your project already uses this technique, it's easy to transition to environment variables. Just move all the setting values to environment variables, and change the Python module to read from those environment variables.
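The transition can keep the module's interface intact so that from settings import ... keeps working; a sketch using the same environment variable names as the earlier example:
# settings.py: same names as before, values now come from the environment.
import os

db_server = os.getenv('MY_APP_DB_SERVER', 'localhost')
db_user = os.getenv('MY_APP_DB_USER', 'my_app')
db_password = os.environ['MY_APP_DB_PASSWORD']    # deliberately no default for the secret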

Alternative to attempting to persist Environment Variables in Python

Up until now, whenever I have needed to store a "secret" for a simple python application, I have relied on environment variables. In Windows, I set the variables via the Computer Properties dialog and I access them in my Python code like this:
database_password = os.environ['DB_PASS']
The simplicity of this approach has served me well. Now I have a project that uses Oauth2 authentication and I have a need to store tokens to the environment that may change throughout program execution. I want them to persist the next time I execute the program. This is what I have come up with:
#fetch a new token
token = oauth.fetch_token('https://api.example.com/oauth/v2/token', code=secretcode)
access_token = token['access_token']
#make sure it persists in the current session
os.environ['TOKEN'] = access_token
#store to the system environment (Windows)
cmd = 'SETX /M TOKEN ' + access_token
os.system(cmd)
It gets the job done quickly for me today, but does not seem like the right approach to add to my toolbox. Does anyone have a more elegant way of doing what I am trying to do that does not add too many layers of complexity? If the solution worked across platforms that would be a bonus.
I have used the Python keyring module with great success. It's an interface to credential vaults provided by the operating system (e.g., Windows Credential Manager). I haven't used it on Linux, but it appears to be supported, as well.
Storing a password/token and then retrieving it can be as simple as:
import keyring
keyring.set_password("system", "username", "password")
keyring.get_password("system", "username")
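Applied to the token example from the question, that could be wrapped in two small helpers (the service and key names here are arbitrary):
import keyring

def save_token(access_token):
    # Persist the freshly fetched OAuth token in the OS credential vault.
    keyring.set_password('my_oauth_app', 'access_token', access_token)

def load_token():
    # Returns None if no token has been stored yet.
    return keyring.get_password('my_oauth_app', 'access_token')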

How to avoid storing the username and password as part of the connection string

Situation
My team works with an NFS. I've written a Python script that connects to a database. People execute my Python script, and based on some values associated with their userid in our backend MongoDB database, the script performs certain actions for them.
My question
I have a mongodb connection string like so
database_client = MongoClient("mongodb://<myusername>:<mypassword>@serverip:port/DBName")
The problem is that anyone can simply read the py script and figure out what my username and password are.
What is not an option
Making the Python script an opaque executable is not allowed. Not to mention it wouldn't protect my credentials from leaking to other members of the team who work on the same codebase. Nor is making the db world editable/readable an option. All requests to edit the db must go through an authorized account and only through the API provided by the script.
Is there a way to do this/what should I be doing instead?
Edit: I am NOT using Heroku. I'm behind a company firewall and the mongoDB server is a machine with an IP on the company network
One idea is to read the username and password from a configuration file [1] such as
[topsecret]
user = foo
password = bar
This way you can share the code with anyone and they will be able to execute it only if they have the valid configuration file.
[1] You can use the configparser module (https://docs.python.org/3.4/library/configparser.html) to parse configuration files such as the one above.
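Putting it together for the MongoDB case could look like this (a sketch; the db.ini file name is an assumption, the section and key names follow the example above, and quote_plus escapes special characters in the credentials):
import configparser
from urllib.parse import quote_plus

from pymongo import MongoClient

config = configparser.ConfigParser()
config.read('db.ini')               # kept outside the repo, readable only by authorized users

user = quote_plus(config['topsecret']['user'])
password = quote_plus(config['topsecret']['password'])

database_client = MongoClient(
    "mongodb://%s:%s@serverip:port/DBName" % (user, password))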

How is python-keyring implemented on Windows?

How does python-keyring provide security on Windows?
In GNOME/KDE on Linux, the user is prompted to enter his password to authorize access to the keyring on a per-application basis.
In Windows there is no such prompt when an application accesses the keyring. What is stopping a random python application to retrieve a password from the keyring by running
import keyring
keyring.get_password(service, username)
How is user consent implemented? Is the whole concept, at least on Windows, based on the assumption that all installed programs are 'trusted'?
Researching this a bit, it appears that the passwords are stored within a Windows Credential Vault, which is the equivalent of the Gnome or KDE keyrings. You can actually see the ones that you have stored by opening up the Windows Credential Manager. I get there by just typing in Credential Manager on Windows 8.1 from the start screen, but I think you can get to it from the User accounts page as well.
Anyway, as you can see in the Credential Manager window, the password that I added to the keyring as a test is displayed under Windows Credentials -> Generic Credentials -> keyring_demo. Opening this window up as another user on the PC does not show this password, so it seems secured from other users. This screen also allows you to revoke or change passwords.
As to how consent is implemented, I believe keyring will operate as long as your Windows user account is logged in, but I don't know the specifics.
The Credential Manager method works, but in my case, if I add:
internet or network address: "myPassGroup"
username: "pass1"
password: "xxx"
and then add another entry using the same network address:
internet or network address: "myPassGroup"
username: "pass2"
password: "xxx"
then pass2 will OVERRIDE the first entry, pass1.
This is a major drawback: the "internet or network address" serves as the group name in keyring, and I need to put multiple passwords under the same name.
My solution is to use the Python commands directly:
open CMD in Windows
type python
then type import keyring
then type keyring.set_password("groupName", "passKey", "password")
then type keyring.set_password("groupName", "passKey2", "password2")
You can validate the result with:
keyring.get_password("groupName", "passKey")
keyring.get_password("groupName", "passKey2")
I know this works, but I still struggle to find where the actual data is saved. I used the following commands to try to find out:
python -c "import keyring.util.platform_; print(keyring.util.platform_.config_root())"
python -c "import keyring.util.platform_; print(keyring.util.platform_.data_root())"
The data_root in my case is "C:\Users\JunchenLiu\AppData\Local\Python Keyring". I checked the folder and it doesn't exist, so the data must be stored somewhere else (presumably in the Windows Credential Vault described above); maybe someone can figure it out.
But my solution should work perfectly on Windows.
This is from the python-keyring GitHub page. I imagine a similar concern exists for Windows as for macOS, though the project says no analysis has been completed for Windows.
Security Considerations
Each builtin backend may have security considerations to understand before using this library. Authors of tools or libraries utilizing keyring are encouraged to consider these concerns.
As with any list of known security concerns, this list is not exhaustive. Additional issues can be added as needed.
macOS Keychain
Any Python script or application can access secrets created by keyring from that same Python executable without the operating system prompting the user for a password. To cause any specific secret to prompt for a password every time it is accessed, locate the credential using the Keychain Access application, and in the Access Control settings, remove Python from the list of allowed applications.
For context, the snippet below (which appears to come from the keyring project's own examples) shows a minimal custom backend, a keyring that can store only one password in memory:
from keyring.backend import KeyringBackend

class SimpleKeyring(KeyringBackend):
    """Simple Keyring is a keyring which can store only one
    password in memory.
    """
    def __init__(self):
        self.password = ''

    def supported(self):
        return 0

    def get_password(self, service, username):
        return self.password

    def set_password(self, service, username, password):
        self.password = password
        return 0

    def delete_password(self, service, username):
        self.password = None
