I want to encrypt a .zip file using AES256 in Python. I am aware of the Python cryptography module, in particular the example given at:
https://cryptography.io/en/latest/fernet/
However, I have needs that are a bit different:
I want to output binary data (because I want a small encrypted file). How can I output in binary instead of armored ASCII?
I do not want to have the plaintext timestamp. Any way to remove it?
If I cannot fix those points I will use another method. Any suggestions? I was considering issuing gpg commands through subprocess.
Looking at Fernet module, seems it encrypts and authenticates the data. Actually its safer than only encrypting (see here). However, removing the timestamp, in the case of this module, doesn't make sense if you also want to authenticate.
Said that, seems you want to risky and only encrypt instead of encrypt and authenticate. You might follow the examples of the same module found at https://cryptography.io/en/latest/hazmat/primitives/symmetric-encryption/. Just make sure this is what you really want.
As you're worried about size and want to use AES, you could try AES in CTR mode, which does not need padding, avoiding extra bytes at the end.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
backend = default_backend()
key = os.urandom(32)
nonce = os.urandom(16)
cipher = Cipher(algorithms.AES(key), modes.CTR(nonce), backend=backend)
encryptor = cipher.encryptor()
ct = encryptor.update(b"a secret message") + encryptor.finalize()
print(ct)
decryptor = cipher.decryptor()
print(decryptor.update(ct) + decryptor.finalize())
So, answering your questions:
(1) The update method already returns a byte array.
(2) This way there will be no plaintext data automatically appended to the ciphertext (but be aware of the security implications about not authenticating the data). However, you'll need to pass the IV anyway, what you would have to do in either case.
Related
I am having database credentials in my python code, which I would like to have it encrypted, use the value in run time by decrypting it.
I've found the below code with the help of stackoverflow and working as expected
from Crypto.Cipher import AES
import base64
msg_text = b'test some plain text here'.rjust(32)
secret_key = b'1234567890123456' # create new & store somewhere safe
cipher = AES.new(secret_key,AES.MODE_ECB) # never use ECB in strong systems obviously
encoded = base64.b64encode(cipher.encrypt(msg_text))
print(encoded)
# ...
decoded = cipher.decrypt(base64.b64decode(encoded))
print(decoded.strip())
Above code has secret_key and comment says to create new secret key.
How can I create a secret key and from where it can be created?
What would be the recommended place to store secret keys? Is there any structure/place that's recommended to save? I think it should be saved in database
Is above code the strong way of encrypting and decrypting? If it can be tampered, what way should be approached? Providing sample link would be a great help
Instead of hardcoding the password into source code, you can use a password and generate the keys by using PBKDF2 functions on the runtime.
A password should not be saved in the database, or in a file. You must keep in the memory.
The ECB mode is insecure, it leaks pattern on the data, see the penguin in Wikipedia. You should use CBC mode or CTR mode for encryption. However keep in mind that, while you can execute equality queries with ECB mode, you cannot execute with CBC or CTR mode. If the ECB mode suits your case, that is; the pattern is not a security issue, you can use ECB.
My server encrypts files using pycrypto with AES in CTR mode. My counter is a simple counter like this:
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03
I wanna decrypt the cipher text with c++'s cryptopp library in my clients. How should I do so?
Python code:
encryptor = AES.new(
CRYPTOGRAPHY_KEY,
AES.MODE_CTR,
counter=Counter.new(128),
)
cipher = encryptor.encrypt(plain_text)
C++ code so far:
byte ctr[] = "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01"
mDecryptor = new CryptoPP::CTR_Mode<CryptoPP::AES>::Decryption(key, 32, ctr);
std::string plain;
CryptoPP::StringSource(std::string(data, len), true, new CryptoPP::StreamTransformationFilter(*mDecryptor, new CryptoPP::StringSink(plain)));
but after running this plain is garbage.
Update:
Sample encrypted data you can try to decrypt with crypto++ so that you can help me even if you don't know python and you're just experienced with crypto++:
Try to decrypt this base64 encoded text:
2t0lLuSBY7NkfK5I4kML0qjcZl3xHcEQBPbDo4TbvQaXuUT8W7lNbRCl8hfSGJA00wgUXhAjQApcuTCZckb9e6EVOwsa+eLY78jo2CqYWzhGez9zn0D2LMKNmZQi88WuTFVw9r1GSKIHstoDWvn54zISmr/1JgjC++mv2yRvatcvs8GhcsZVZT8dueaNK6tXLd1fQumhXCjpMyFjOlPWVTBPjlnsC5Uh98V/YiIa898SF4dwfjtDrG/fQZYmWUzJ8k2AslYLKGs=
with this key:
12341234123412341234123412341234
with counter function described in the beginning of this post using crypto++. If you succeed post the decrypted text (which contains only numbers) and your solution please.
Update2:
I'm not providing an IV in python code, the python module ignores IV. I the IV thing is what causing the problem.
As I read their source codes I can say PyCrypto and Crypto++ Both are perfect libraries for cryptography for Python and C++. The problem was that I was prefixing the encrypted data with some meta information about file and I totally forgot about that, after handling these meta data in client Crypto++ decrypted my files.
As I didn't find this documented explicitly anywhere (not even in Wikipedia) I write it here:
Any combination of Nonce, IV and Counter like concatenation, xor, or likes will work for CTR mode, but the standard that most libraries implement is to concatenate these values in order. So the value that is used in block cipher algorithm is usually: Nonce + IV + Counter. And counter usually starts from 1 (not 0).
I'm developing a web app (using gevent, but that is not significant) that has to write some confidential information in log. The obvious idea is to encrypt the confidential information using a public key that is hard-coded into my application. To read it, one would need a private key, and 2048-bit RSA seems to be safe enough. I have chosen pycrypto (tried M2Crypto as well, but found nearly no differences for my purpose) and implemented log encryption as a logging.Formatter subclass. However, I'm new to pycrypto and cryptoraphy, and I am not sure my choice of the way my data is encrypted is reasonable. Is PKCS1_OAEP module what I need? Or there are more friendly ways of encryption without dividing the data in small chunks?
So, what I did is:
import logging
import sys
from Crypto.Cipher import PKCS1_OAEP as pkcs1
from Crypto.PublicKey import RSA
PUBLIC_KEY = """ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDe2mtK03UhymB+SrIbJJUwCPhWNMl8/gA9d7jex0ciSuFfShDaqJ4wYWG4OOl\
VqKMxPrPcZ/PMSwtc021yI8TXfgewb65H/YQw4JzzGANq2+mFT8jWRDn+xUc6vcWnXIG3OPg5DvIipGQvIPNIUUP3qE7yDHnS5xdVdFrVe2bUUXmZJ9\
0xJpyqlTuRtIgfIfEQC9cggrdr1G50tXdXZjS0M1WXl5P6599oH/ykjpDFrCnh5fz9WDwUc0mNJ+11Qh+yfDp3k7AhzhRaROKLVWnfkklFaFm7LsdVX\
KPjp7dPRcTb84c2OnlIjU0ykL74Fy0K3eaPvM6TLe/K1XuD3933 pupkin#pupkin"""
PUBLIC_KEY = RSA.importKey(PUBLIC_KEY)
LOG_FORMAT = '[%(asctime)-15s - %(levelname)s: %(message)s]'
# May be more, but there is a limit.
# I suppose, the algorithm requires enough padding,
# and size of padding depends on key length.
MAX_MSG_LEN = 128
# Size of a block encoded with padding. For a 2048-bit key seems to be OK.
ENCODED_CHUNK_LEN = 256
def encode_msg(msg):
res = []
k = pkcs1.new(PUBLIC_KEY)
for i in xrange(0, len(msg), MAX_MSG_LEN):
v = k.encrypt(msg[i : i+MAX_MSG_LEN])
# There are nicer ways to make a readable line from data than using hex. However, using
# hex representation requires no extra code, so let it be hex.
res.append(v.encode('hex'))
assert len(v) == ENCODED_CHUNK_LEN
return ''.join(res)
def decode_msg(msg, private_key):
msg = msg.decode('hex')
res = []
k = pkcs1.new(private_key)
for i in xrange(0, len(msg), ENCODED_CHUNK_LEN):
res.append(k.decrypt(msg[i : i+ENCODED_CHUNK_LEN]))
return ''.join(res)
class CryptoFormatter(logging.Formatter):
NOT_SECRET = ('CRITICAL',)
def format(self, record):
"""
If needed, I may encode only certain types of messages.
"""
try:
msg = logging.Formatter.format(self, record)
if not record.levelname in self.NOT_SECRET:
msg = encode_msg(logging.Formatter.format(self, record))
return msg
except:
import traceback
return traceback.format_exc()
def decrypt_file(key_fname, data_fname):
"""
The function decrypts logs and never runs on server. In fact,
server does not have a private key at all. The only key owner
is server admin.
"""
res = ''
with open(key_fname, 'r') as kf:
pkey = RSA.importKey(kf.read())
with open(data_fname, 'r') as f:
for l in f:
l = l.strip()
if l:
try:
res += decode_msg(l, pkey) + '\n'
except Exception: # A line may be unencrypted
res += l + '\n'
return res
# Unfortunately dictConfig() does not support altering formatter class.
# Anyway, in demo code I am not going to use dictConfig().
logger = logging.getLogger()
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(CryptoFormatter(LOG_FORMAT))
logger.handlers = []
logger.addHandler(handler)
logging.warning("This is secret")
logging.critical("This is not secret")
UPDATE: Thanks to the accepted answer below, now I see:
My solution seems to be pretty valid for now (very few log entries, no performance considerations, more or less trusted storage). Concerning security, the best thing I can do right now is not forgetting to prohibit the user who runs my daemon from writing to the .py and .pyc files of the program. :-) However, if the user is compromised, he still may try to attach a debugger to my daemon process, so I should also disable login for him. Pretty obvious moments, but very important ones.
Surely there are solutions being much more scalable. A very common technique is to encrypt AES keys with slow but reliable RSA, and to encrypt data with the AES that is pretty fast. Data encryption in the case is symmetric, but retrieving the AES key requires either breaking RSA, or getting it from memory when my program is running. Stream encryption with higher-level libraries and binary log file format also are a way to go, though binary log format encrypted as a stream should be very vulnerable to log corruption, even a sudden reboot due to electricity blackout may be a problem unless I do some things at a lower level (at least log rotation on each daemon start).
I changed .encode('hex') to .encode('base64').replace('\n').replace('\r'). Fortunately, the base64 codec works fine with no line ends. It saves some space.
Using an untrusted storage may require signing records, but that seems to be another story.
Checking if the string is encrypted based on catching exceptions is ok, since, unless the log is tampered with by a malicious user, it's base64 codec who raises an exception, not RSA decryption.
You seem to encrypt data directly with RSA. This is relatively slow, and has the problem that you can only encrypt small parts of data. Distinguishing encrypted from plaintext data based on "decryption doesn't work" is also not a very clean solution, although it will probably work. You do use OAEP, which is good. You may want to use base64 instead of hex to save space.
However, crypto is easy to get wrong. For this reason, you should always use high-level crypto libraries wherever possible. Anything where you have to specify padding schemes yourself isn't "high-level". I am not sure if you will be able to create an efficient, line-based log encryption system without resorting to rather low-level libraries, though.
If you have no reason to encrypt only individual parts of the log, consider just encrypting the entire thing.
If you are really desperate for a line-based encryption, what you could do is the following: Create a random symmetric AES key from a secure randomness source, and give it a short but unique ID. Encrypt this key with RSA, and write the result to the log file in a line prefixed with a tag, e.g. "KEY", together with the ID. For each log line, generate a random IV, encrypt the message with AES256 in CBC mode using said IV (you don't have any length limits per line now!) and write the key ID, IV and the encrypted message to the log, prefixed with a tag, e.g. "ENC". After a certain time, destroy the symmetric key and repeat (generate new one, write to log). The disadvantage of this approach is that an attacker who can recover the symmetric key from memory can read the messages encrypted with said key. The advantage is that you can use higher-level building blocks and it is much, much faster (on my CPU, you can encrypt 70,000 log lines of 1 KB per second with AES-128, but only around 3,500 chunks of max. 256 bytes with RSA2048). RSA decryption is REALLY slow, by the way (around 100 chunks per second).
Note that you have no authentication, i.e. you won't notice modifications to your logs. For this reason, I assume you trust the log storage. Otherwise, see RFC 5848.
I'm trying to write a simple Python solution to encrypt a file securely using a passphrase. I figured I would use something like bcrypt or pbkdf2 so that as time goes on, I could make my password hashes more and more difficult to brute force. I also figured I would use AES for the actual encryption, as it's a pretty safe standard. I'm not fixed on the encryption cipher, but I really like bcrypt.
I'm having quite a difficult time figuring out how to actually perform the encryption. Let's say I have a passphrase and a file I'd like to encrypt. I'd assume that I essentially need to do something like this:
from Crypto.Cipher import AES
from bcrypt import gensalt, hashpw
from hashlib import sha256
def encryptify(passphrase, file_name):
target_file = open(file_name, 'r')
# generate password, takes time
passphrase_rounds = 15
passphrase_salt = gensalt(rounds)
passphrase = sha256(hashpw(passphrase, passphrase_salt)).hexdigest()
# encrypt the file
encrypted_file = AES.new(passphrase, AES.MODE_CBC).encrypt(target_file.read())
At the final step, it fails with a ValueError, telling me that my key must be 16, 24, or 32 bytes long. What I'm not understanding is if what I'm doing is secure and why the last step is failing. I thought that SHA256 outputs 32 characters of data?
I'm particularly concerned about taking a bcrypt passphrase and throwing it through sha256, are there any potential security risks by doing this? I wouldn't imagine so, but then again, I'm not a cryptographer.
I can't comment about safety, but if you want your actual 32 bytes of SHA256, you need to call digest, not hexdigest. hexdigest returns a hexadecimal string representation (that would be 64 characters).
I have a 'public key' in a variable named varkey, for getting the public key I used the urllib and stored that public key in a variable. Now I want to encrypt a msg/string using the public key.
It's ok if somebody could lead me to some library.
My blog post (the passingcuriosity.com link in John Boker's answer) does AES -- a symmetric encryption algorithm -- using the M2Crypto library. M2Crypto is a Python wrapper around OpenSSL. The API is pretty much a straight translation of OpenSSL's into Python, so the somewhat sketchy documentation shouldn't be too much of a problem. If the public key encryption algorithm you need to use is supported by M2Crypto, then you could very well use it to do your public key cryptography.
I found the M2Crypto test suite to be a useful example of using its API. In particular, the RSA (in test_rsa.py), PGP (in test_pgp.py), and EVP (in test_evp.py) tests will help you figure out how to set up and use the library. Do be aware that they are unit-tests, so it can be a little tricky to figure out exactly what code is necessary and what is an artefact of being a test.
PS: As I'm new, my posts can only contain one link so I had to delete most of them. Sorry.
Example
from M2Crypto import RSA
rsa = RSA.load_pub_key('rsa.pub.pem')
encrypted = rsa.public_encrypt('your message', RSA.pkcs1_oaep_padding)
print encrypted.encode('base64')
Output
X3iTasRwRdW0qPRQBXiKN5zvPa3LBiCDnA3HLH172wXTEr4LNq2Kl32PCcXpIMxh7j9CmysLyWu5
GLQ18rUNqi9ydG4ihwz3v3xeNMG9O3/Oc1XsHqqIRI8MrCWTTEbAWaXFX1YVulVLaVy0elODECKV
4e9gCN+5dx/aG9LtPOE=
Here's the script that demonstrates how to encrypt a message using M2Crypto ($ easy_install m2crypto) given that public key is in varkey variable:
#!/usr/bin/env python
import urllib2
from M2Crypto import BIO, RSA
def readkey(filename):
try:
key = open(filename).read()
except IOError:
key = urllib2.urlopen(
'http://svn.osafoundation.org/m2crypto/trunk/tests/' + filename
).read()
open(filename, 'w').write(key)
return key
def test():
message = 'disregard the -man- (I mean file) behind curtain'
varkey = readkey('rsa.pub.pem')
# demonstrate how to load key from a string
bio = BIO.MemoryBuffer(varkey)
rsa = RSA.load_pub_key_bio(bio)
# encrypt
encrypted = rsa.public_encrypt(message, RSA.pkcs1_oaep_padding)
print encrypted.encode('base64')
del rsa, bio
# decrypt
read_password = lambda *args: 'qwerty'
rsa = RSA.load_key_string(readkey('rsa.priv2.pem'), read_password)
decrypted = rsa.private_decrypt(encrypted, RSA.pkcs1_oaep_padding)
print decrypted
assert message == decrypted
if __name__ == "__main__":
test()
Output
gyLD3B6jXspHu+o7M/TGLAqALihw7183E2effp9ALYfu8azYEPwMpjbw9nVSwJ4VvX3TBa4V0HAU
n6x3xslvOnegv8dv3MestEcTH9b3r2U1rsKJc1buouuc+MR77Powj9JOdizQPT22HQ2VpEAKFMK+
8zHbkJkgh4K5XUejmIk=
disregard the -man- (I mean file) behind curtain
From my recent python experience, python doesn't do encryption natively. You need to use an external (3rd party) package. Each of these, obviously, offers a different experience. Which are you using? This will probably determine how your syntax will vary.
You might want to have a look at:
http://www.example-code.com/python/encryption.asp
or this
http://passingcuriosity.com/2009/aes-encryption-in-python-with-m2crypto/
Have you ever heard about "RSAError: data too large for key size"?
Try your sample with more long message:
encrypted = rsa.public_encrypt('My blog post (the passingcuriosity.com link in John Boker's answer) does AES -- a symmetric encryption algorithm -- using the M2Crypto library', RSA.pkcs1_oaep_padding)
You could use MD5 or SHA1 hashing along with your key...