I'm currently using Fernet encryption which uses AES 128 keys. However my client requires using AES 256. I'm not very familiar with cryptography but here is what I understood so far.
Fernet needs a 256-bit key that is split in half. The first half is the signing key, the second half is the encryption key. As each half is 128 bits long, this is AES-128.
Would it be enough to double the input key and modify the implementation as below to get AES-256?
class Fernet(object):
    def __init__(self, key, backend=None):
        if backend is None:
            backend = default_backend()
        key = base64.urlsafe_b64decode(key)  # Here 512 bits long instead of 256
        self._signing_key = key[:16]  # double this
        self._encryption_key = key[16:]  # double this
        self._backend = backend
Yes, you could double the binary input, that is, the key material before it is base64 encoded. Whether the result offers 256-bit security depends on how the key is generated. So yes, it is possible to double the size check on the key, but that alone doesn't say much. If the input key material is 512 bits long with a security level of 512 bits, then splitting the key is fine.
Personally I would recommend (and I have recommended it in the past to Fernet) to use HKDF to derive the two keys instead of just splitting the key in two. I cannot see how the key is generated, but if it is generated by PBKDF2 - which Fernet does use to create keys from passwords - then PBKDF2 may require double the amount of work to generate 512 bits, while the attacker will only have to generate 256 bits to perform an attack (and therefore perform half of the work).
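As a rough illustration of the HKDF suggestion, here is a minimal sketch of HKDF with SHA-256 (RFC 5869) built only on the standard library, used to derive two independent 256-bit keys from one master key. The function name hkdf_sha256 and the two info labels are hypothetical choices for this example; they are not part of Fernet or of the cryptography library.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm, info, length, salt=b""):
    """HKDF (extract-then-expand, RFC 5869) with HMAC-SHA256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    if not salt:
        salt = b"\x00" * hash_len
    # Extract: concentrate the input key material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: produce output blocks bound to the context string `info`.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Assumption: a single 256-bit master key with full entropy.
master_key = os.urandom(32)

# Hypothetical context labels; any two distinct, fixed labels would do.
signing_key = hkdf_sha256(master_key, b"example signing key", 32)
encryption_key = hkdf_sha256(master_key, b"example encryption key", 32)
```

With this approach the master key stays 256 bits long, and each derived key is a full 256 bits, so neither key is simply a slice of the input.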
Note that using base64 encoding is not great for keys as strings are hard to delete from memory in most runtimes; it's much better if the keys are stored in a key store.
The Fernet Specification is to use AES-128. If you modify the algorithm to use AES-256, then you would no longer be using Fernet. If it's a requirement that you use both Fernet and AES-256, I would recommend encrypting your payload independently using AES-256 and then applying the Fernet algorithm to the result. This essentially encrypts it again using AES-128.
Related
When I used the rsa library to encrypt content in Python, I found that even with the same public key and the same plaintext, the output was different each time, and every output ciphertext could still be decrypted correctly.
So I want to know how RSA encryption produces a different result each time.
The following is the source code and the ciphertext output from several runs.
import rsa
data = b'hello, world'
pk = rsa.PublicKey(21968272887747488664299300886573437453854580842272801065486318320328573181104433915148345103361664593733184722692105149694142557011266255075972021704711966860643495011049367729520386363274015109405027569939049707059547205662044677513224725454246882263137472476944688288600202939249708651097639414591301098996178101611307541565108035735952182518865647460401330824147744542993709272159435504287548711774248609991298003738752699597664282754244110245104529559246443251024491287411685325071990133422302961361831613169335261576570530061643400976849033234171349450189113706076777344091951159628029458250885131329209309850429, 65537)
sk = rsa.PrivateKey(21968272887747488664299300886573437453854580842272801065486318320328573181104433915148345103361664593733184722692105149694142557011266255075972021704711966860643495011049367729520386363274015109405027569939049707059547205662044677513224725454246882263137472476944688288600202939249708651097639414591301098996178101611307541565108035735952182518865647460401330824147744542993709272159435504287548711774248609991298003738752699597664282754244110245104529559246443251024491287411685325071990133422302961361831613169335261576570530061643400976849033234171349450189113706076777344091951159628029458250885131329209309850429, 65537, 7180742814003184493745817226790609535628314246962295259545720906634095162818242875479619891118201610188935763454388765380592975819694916096822751254380575157372246976924478622789961650274744826184819271605876418277150620865958482714928972468695190683750109638846897363602141498155351308783613387153774908482554823734710213533339079775940427840254792667407339506634483414544868884993644469123554250547973774825288728499603644573043340903253662627022861078040710813466717381393318974263956822836617559198769733538785368579523554468493535497334351910973554355558084517450711717078208243534059900951053098416621979162953, 2892399658197458942905975614589062229163400545478597547382814345027395128547900843767403239802516658965367060847402270250006453487328128143951683257674546551047677883067394312961875875837583648708792776670850392284514504120294996660277476938434444686489314576152155327763997732075822518345380214599954128122325100250621109610911, 7595171996887213720796562116779069406951367089854155042546817829399701614804640519699383335239152053864712615020908685785110173445687693446414448808069297671341400340127530462352491976340390927112062123224788804186559233620266300549932283394695195359373967318632526999572685782623554155939)
print(rsa.encrypt(data, pk))
# 1
b'\x17T\xc0\x03\xa4\xa6\xc06\x83\xdcM\xe5\xf9\xd8t\xc9>\xad}\xc9\x15[\xcc!\x19\x97/\xbf\xc7\xe4\xcbhu\x8d\xfb&\x18\x84\xc8e\xec\xe1\n\xfd$\x92\xda\x12S\x0f\r\xba\x81y\x88E\x9ceu\xd9\xd2Z\xf8\xc3\xd3&\xf2\xf7j\t\t\xf2\xc6w\xf6\x9a7\xbd\x01\x96\xad\xf5\x9e\xf4\xa8,\xd2\x19b\x0f\x05\x0c\xd8G\xe66\x91\x85.\xbdX\x0b\xd9H\xb14\xc6\x88\xb5\xd7\x1f\xed\xf7\xb4\x10\xb7\xad\x9f\xab\x01\r(\r*\xd90\x84\xba\xfb\xd9\x94HK\xdf\xaf\xa0\xf2\x98\x96\xb6*b\xb5\xc0\xa6\xe5A[\x9fwf\x18\x08v\x85\t\xb7\xf7\x97\xc74\xe5{;9qw\xb1u>\t`\xfd\x10\xfbu\xfb\xf5\x11\xe9\xc1\xa0I\x96\x03\xa5\x84\x0b\xcd\x060\xa1\xb1\xbcs|\xfe\xf3N\xad\xddA\xe2l\xf83N\xae\x9c\xbe\x1568\xe9\xf5\xfdn\xe9\xbc\x98\xb5\xb9Bn\xf1]!\x86\xd39\xd2<&\xd6}\x9a\xe2\xa4|\xf0\x9a\xaf\xac\x08^\x93\x174\n~L<+=\x8d\x95'
# 2
b'5\xbc\xb2\xaa\x16\'\xa2\x93\x16D\'S\xfc\x9fm\xc9\xbbF\xa6:dN\x91f\xc1\xaa\x05\xeb\xe4\x16|\xd3\x07#\xd5\xda\xe9\x9b\xd0V\xd4\xb0#Y\xf2G\x0c\xae\xb7A\x9a\xaa\xb8^\xf8\xea\xddj%\xd0\xe8w\xb2\xf1\x9c\xf8D\xcc\x9b\xfe\xea\x16hT\x81\'u`\x10"\xaf\xe3\xd3#\xa0\xc2\x18\x8f^lE\xb0H\xe8\xd5\xf2\x8e\xd8\x8fq;\xd7B]\xc8j\x94\'0\xb0\x80\x0f\xd3\xd1\x90I\x1eL\x91y\x8dA\x01\xda>x`\x0b}6:\xb6o\xcf\xd1=\x15p\xdb\x16\xd3bF\xd5\xc9\\\x86\x1b\xeb\xc4H\x11\x04\xa9o\xe1\xffSF\xe3\xc1\x99\x05\xc44\x03\x86\x81\xbb#>\xfb\xc2\x0bscbW\x0f\xb8\x92\x81\xbb\x19c\xd1n\t\xa4sI\x91+\x97\x9e\x0b\xf1\x8b\xd2;\xa9NV\xc1\xb0#\xd1\xa24P\xce\x93US\xf5\x97=m\xb3\xb6\xd3\x9b\'\xade\x1e\xbc\x80\x13C\x99\x93\x89&\xbd\xde\x83f\\H6\xad2\nFM\xf07q\xe9`\xb1H\x98#X'
# 3
b"'E\xdb\xfd\xe4\xf9\x0c\xe1\xa4l\xaaq\x0e#\xde2\xe9\xe4\x12\xb3\xc2d\xd1W\xde*\x8d<\xcb\x1a\xea\xb4\xb86\x9bV0\r\xef\xfb\xafg\xe8\x1eHzg\x03I\x99ta\xad\x84[r.E\xbb\xc2\xae\xf1\xc2\xafd\xcb\xa6`\xf0)U\x85\xb1\n0\xb2\x05\x17s\xa3\xe3f\xb7\xda\x08\xd1\xae#\xd8\xa7\x90Tce\xc2\xac\xf3Q\x81\xbe1\x92\x8d\xcb\xbf\xfa\x88\xf3'\xe8\xa1\x9e\x9e\xae~\xb90Uq\x98\xe6\x17b\x9d]1\xf6\xabirw\xbc\x89\xae\xd8\xdf\x8a\xf5\xf1\xd4*~\x94\xe38\x1f$\x0e\x94t\xb64\x83q\xf8\x8f\xd6pR\xd4%\xf8\x1cv\xc5\xfe\x8d]\xcfy\xff\xb9\xc7\x10\xaao%\xa8\x13\xce6#Y\xfa\x06\xb8\xab(H^\xd8\x1a\xb63\xb0\xb0c\xe0\x11#\xa9\t\xdd\xa8\\\xeag\xc6H\xa5L\x0b\x10\xdb\xa9\xc44\xdcZ\xf1`\xa2\xc1^;\x1d\xdf\xbf\x92\x894\x847\xe9\x16\x15\xad\xd1c\xf9.\xc21\x02\x85\xb1\x0b\x96=\xf3D\xdf\xf7\xbep\x9c"
# 4
b'$\x82\xc8\x95\xcb\xdaq\xc0\x16\x0e\xef\xb6\xc8\x89\xabKQafM\x10^\x11\xea2\xfc\x8b\x0b~H\xfd\xe5\xe0\x80\x81<\xae\xb7\xfeT)K\xb3\x96\xc0y\x83e\x93\xae\xdb\x93\x82\xea\xb7\xb7\xdbQJX\xb2\xfdM\xf2(A6+e\xb7\x89\x8a\xba6\xb7\xa3\xde*\xea\xe0\x1cR\xa9i\x8a\x9aEK\xa2T\xebM\xa9\x1d\x96\x87\xaf\xb2I\xcej!"\xe2\xc8\xc08\x94\x8a\x18\x1d\t\x11`\xdf*\xbc\xb9\xf6J\xbci\xb3\xcc\xde\xb0\xa5\x98b}o\x94\xbe\xe0\x7f\xe2J\x8a\xa2)R{U\xdfu\xf6UO\xc2C\xf3\'\x87c\x1e\xc6\xe0\xbe\x879\xa5N\xb3J\xc8Cz\x9b\xa7\xec\x90[\xa8\x8a\xac\xeep\\ar\xbd\x94O\xce]\x1fw\x1bm|K\xce\x15\xf6\xcc\xc5\xc84\x9a\x00Z\x0b\xfd\xe9\xfb^6\x9b\xfd\xeb\x8c\xf1h\xda\x17\xc4\xb0\x08\\-\n7\x9e\x1f\x1d\xa7\xb4\xb9\xf0wq\x9a\x15G\xc5\x90\xf5\x00\x89\tI\x16\x90\xbcI\x80z\x90\xdb\nO\xdc\xe5\x8fh\xca'
Any asymmetric encryption method has to be randomized, so that if you encrypt the same plaintext twice, you don't get the same ciphertext. Otherwise it would be very insecure. Anyone who has the public key can encrypt something. Suppose an adversary has a ciphertext, they want to find out the plaintext, and they have partial information about the plaintext (e.g. they know it's a message in a certain format, but they don't know the exact content). They can try encrypting possible values of the plaintext until the result is the ciphertext they want to break. But since the encryption is randomized, they need to use the same data input and the same random value, otherwise they won't get the same ciphertext. And the adversary can't know what random value went into the ciphertext they want to break.
For RSA, in practice, there are two methods for doing encryption. Both are defined by the document known as PKCS#1. Both take the plaintext to encrypt and apply a transformation to it that involves either prepending random padding bytes (PKCS#1 v1.5) or masking with random data (OAEP). Then the result undergoes the well-known exponentiation part of RSA.
You can use the private exponent to undo the exponentiation and inspect a ciphertext.
n = 21968272887747488664299300886573437453854580842272801065486318320328573181104433915148345103361664593733184722692105149694142557011266255075972021704711966860643495011049367729520386363274015109405027569939049707059547205662044677513224725454246882263137472476944688288600202939249708651097639414591301098996178101611307541565108035735952182518865647460401330824147744542993709272159435504287548711774248609991298003738752699597664282754244110245104529559246443251024491287411685325071990133422302961361831613169335261576570530061643400976849033234171349450189113706076777344091951159628029458250885131329209309850429
e = 65537
d = 7180742814003184493745817226790609535628314246962295259545720906634095162818242875479619891118201610188935763454388765380592975819694916096822751254380575157372246976924478622789961650274744826184819271605876418277150620865958482714928972468695190683750109638846897363602141498155351308783613387153774908482554823734710213533339079775940427840254792667407339506634483414544868884993644469123554250547973774825288728499603644573043340903253662627022861078040710813466717381393318974263956822836617559198769733538785368579523554468493535497334351910973554355558084517450711717078208243534059900951053098416621979162953
c1 = b'\x17T\xc0\x03\xa4\xa6\xc06\x83\xdcM\xe5\xf9\xd8t\xc9>\xad}\xc9\x15[\xcc!\x19\x97/\xbf\xc7\xe4\xcbhu\x8d\xfb&\x18\x84\xc8e\xec\xe1\n\xfd$\x92\xda\x12S\x0f\r\xba\x81y\x88E\x9ceu\xd9\xd2Z\xf8\xc3\xd3&\xf2\xf7j\t\t\xf2\xc6w\xf6\x9a7\xbd\x01\x96\xad\xf5\x9e\xf4\xa8,\xd2\x19b\x0f\x05\x0c\xd8G\xe66\x91\x85.\xbdX\x0b\xd9H\xb14\xc6\x88\xb5\xd7\x1f\xed\xf7\xb4\x10\xb7\xad\x9f\xab\x01\r(\r*\xd90\x84\xba\xfb\xd9\x94HK\xdf\xaf\xa0\xf2\x98\x96\xb6*b\xb5\xc0\xa6\xe5A[\x9fwf\x18\x08v\x85\t\xb7\xf7\x97\xc74\xe5{;9qw\xb1u>\t`\xfd\x10\xfbu\xfb\xf5\x11\xe9\xc1\xa0I\x96\x03\xa5\x84\x0b\xcd\x060\xa1\xb1\xbcs|\xfe\xf3N\xad\xddA\xe2l\xf83N\xae\x9c\xbe\x1568\xe9\xf5\xfdn\xe9\xbc\x98\xb5\xb9Bn\xf1]!\x86\xd39\xd2<&\xd6}\x9a\xe2\xa4|\xf0\x9a\xaf\xac\x08^\x93\x174\n~L<+=\x8d\x95'
import binascii
print(binascii.unhexlify('0' + hex(pow(int(binascii.hexlify(c1), 16), d, n))[2:]))
That last value is the padded plaintext. You can see the data in there, with padding before it. This is the PKCS#1 v1.5 padding method (which is insecure unless used very carefully, and should not be used except for backward compatibility with systems that require it).
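To make the padded layout concrete, here is a small sketch of the PKCS#1 v1.5 encryption padding: 0x00, 0x02, at least eight random nonzero bytes, 0x00, then the message. The helper names are hypothetical, and this is an illustration only; a real unpadding routine has to avoid leaking where the padding check fails, which this one does not attempt.

```python
import os


def pkcs1_v15_pad(message, k):
    """Build EM = 00 02 PS 00 M, where PS is random nonzero padding and len(EM) == k."""
    ps_len = k - len(message) - 3
    if ps_len < 8:
        raise ValueError("message too long for this key size")
    ps = bytearray()
    while len(ps) < ps_len:
        # Draw random bytes and drop zeros, since 0x00 marks the end of the padding.
        ps.extend(b for b in os.urandom(ps_len - len(ps)) if b != 0)
    return b"\x00\x02" + bytes(ps) + b"\x00" + message


def pkcs1_v15_unpad(em):
    """Recover M from EM (not constant time, for illustration only)."""
    if em[:2] != b"\x00\x02":
        raise ValueError("bad padding header")
    separator = em.index(b"\x00", 2)
    if separator < 10:
        raise ValueError("padding string too short")
    return em[separator + 1:]


k = 256  # byte length of a 2048-bit modulus, as in the question
padded_a = pkcs1_v15_pad(b"hello, world", k)
padded_b = pkcs1_v15_pad(b"hello, world", k)
```

Both padded values decode to the same message, yet they differ in the random part, which is why the ciphertexts differ after exponentiation.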
I'm having trouble loading an RSA512 public key from a .bin file into python. The issue mainly stems from the fact that I don't know what format the key is stored in. This is the only description of the file I was given.
"key.bin - Raw binary bytes of RSA 512 bit public key and exponent. Used to verify signature of incoming
packets."
I don't know if this is helpful but here are bytes printed in python of the .bin file.
9902c4a66b1ff76392919e7bbc35d51a5128b9da03e131b489d5ed01c1d075fc4c139a9952e9a3b040d984219a4aef0d421f6b8f9c79e1c3c35a218ecba54dc9010001
The goal of the actual challenge is to build a UDP server that verifies the digital signature and integrity of an incoming packet. Currently I'm using Python 2.7 with the cryptography library. Documentation can be found below.
https://cryptography.io/en/latest/hazmat/primitives/asymmetric/rsa/?highlight=rsa%20512
I have already tried the code below but I get the same error for the first two formats and a slightly different one for the third.
with open("key.bin", "rb") as key_file:
    private_key = serialization.load_der_public_key(key_file.read(), backend=default_backend())
ValueError: Could not deserialize key data.
with open("key.bin", "rb") as key_file:
    private_key = serialization.load_pem_public_key(key_file.read(), backend=default_backend())
ValueError: Could not deserialize key data.
with open("key.bin", "rb") as key_file:
    private_key = serialization.load_ssh_public_key(key_file.read(), backend=default_backend())
ValueError: Key is not in the proper format or contains extra data.
Also the hashing algorithm used for verification is SHA256 but that is probably irrelevant.
key.bin - Raw binary bytes of RSA 512 bit public key and exponent. Used to verify signature of incoming packets.
In an RSA-512 key, the modulus is a 512-bit number, which fits in 64 bytes or 128 hexadecimal digits. Your file is represented by 134 hex digits, so it's likely that 128 of these digits are the modulus and the remaining 6 are the public exponent and possibly metadata.
The public exponent is almost always 3 or 65537=0x010001. Given that key.bin ends with 010001 in hex, a reasonable guess is that those last 3 bytes are the public exponent, and the first 64 bytes are the modulus.
with open("key.bin", "rb") as key_file:
    n_bytes = key_file.read(64)
    e_bytes = key_file.read(3)
You now need to figure out whether the encoding is little-endian or big-endian. You can't tell from the public exponent because it's palindromic. So try both possibilities:
n = int(n_bytes.encode('hex'), 16)
or
n = int(n_bytes[::-1].encode('hex'), 16)
Since you have the key in an ad hoc format, rather than a standard format that is used in real life, you're probably meant to use arithmetic primitives rather than a cryptography library to work with the key.
Your key is not encoded in a known standard. You need to extract the modulus and exponent, and then build the public key out of them.
The modulus defines the RSA key size and is therefore 512 bits or 64 bytes, encoded as an unsigned big-endian value. The public exponent may have any size, but is usually small. The most used exponent value is 010001 in hexadecimal, which is the fifth Fermat number and a prime (also called F4, using a zero-based index). However, it is better to simply take the first 64 bytes as the modulus and assume the rest encodes the public exponent.
So you can use RSAPublicNumbers to create the key from the modulus n and exponent e. The trick is to make sure that you create the modulus as a positive value instead of a negative one.
Let's say that data is the binary data read from the file. Then you can get the public key in the following way.
You may want to use 'little' instead of 'big' if the following doesn't work (big endian is the RSA default, but you never know). However, in your case the little-endian value is divisible by e.g. 11, so that's not a likely modulus value (for a secure key, each prime factor should be close to half the key size in bits).
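from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric.rsa import RSAPublicNumbers

modsize = 512 // 8  # modulus size in bytes
mod_bytes = data[:modsize]
mod = int.from_bytes(mod_bytes, byteorder='big')
exp_bytes = data[modsize:]
exp = int.from_bytes(exp_bytes, byteorder='big')
# RSAPublicNumbers takes the exponent first, then the modulus
pubkey = RSAPublicNumbers(exp, mod).public_key(default_backend())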
Note that from_bytes was only added in Python 3.2. RSAPublicNumbers is a bit weird in the sense that it takes the exponent parameter before the modulus. Every other API that I've seen takes the modulus before the exponent.
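The endianness question can be explored with the standard library alone. The sketch below, assuming Python 3, parses the hex dump from the question under both byte orders and lists any small divisors of each candidate modulus; a genuine RSA modulus should have none. The helper small_divisors and the limit of 1000 are arbitrary choices for this illustration.

```python
import binascii

# Hex dump of key.bin as posted in the question.
key_hex = "9902c4a66b1ff76392919e7bbc35d51a5128b9da03e131b489d5ed01c1d075fc4c139a9952e9a3b040d984219a4aef0d421f6b8f9c79e1c3c35a218ecba54dc9010001"
data = binascii.unhexlify(key_hex)

MOD_SIZE = 512 // 8  # assumed: the first 64 bytes are the modulus

n_big = int.from_bytes(data[:MOD_SIZE], byteorder="big")
n_little = int.from_bytes(data[:MOD_SIZE], byteorder="little")
e = int.from_bytes(data[MOD_SIZE:], byteorder="big")


def small_divisors(n, limit=1000):
    """Return all divisors of n in the range [2, limit)."""
    return [d for d in range(2, limit) if n % d == 0]


print("exponent:", e)
print("big-endian candidate, small divisors:", small_divisors(n_big))
print("little-endian candidate, small divisors:", small_divisors(n_little))
```

A candidate that shows small divisors can be ruled out as a modulus, which leaves the other byte order as the plausible one.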
So, in pyDes, a DES library, there is an API that goes like this: pyDes.des(key, [mode], [IV], [pad], [padmode]). A usage example is k = des("DESCRYPT", CBC, "\0\0\0\0\0\0\0\0", pad=None, padmode=PAD_PKCS5), where I can use either CBC or ECB mode of encryption. However, as an assignment from my professor, I am told to encrypt using the pyDes library, but to implement CBC and counter mode manually.
I managed to do CBC mode fine, unfortunately I am stuck with the counter mode.
Using the given API des(key, CBC, IV, ...), I can only pass an IV when I use CBC or ECB mode of operation. I cannot write something like des("hello", mode=None, "foo", ...) where "foo" is my IV. (I am supposed to implement counter mode, and the IV is random in every single iteration.)
So, my question is: has anyone faced this issue and managed to overcome it?
The main operation that you need to isolate in order to implement a mode is the actual block cipher without a mode of operation or padding. pyDes doesn't seem to provide direct access to the block cipher, but you can emulate it easily with ECB mode. ECB simply executes the block cipher on every input block independently, in the same way.
The idea would be to create a counter input stream, execute ECB on the input stream to get the key stream and then XOR every byte of the plaintext with the corresponding byte in the key stream.
Steps for CTR mode:
1. Generate a random nonce (IV) in the range of 0 to 1<<64 (the DES block size is 64 bits), which is the starting counter:
   import random
   r = random.SystemRandom()
   nonce = r.randrange(0, 1<<64)
2. Convert the counter for each block of the plaintext to bytes with struct.pack('>Q', counter) and increase the counter by one.
3. Repeat until you have at least as many bytes in the counter stream as in the plaintext.
4. Encrypt the counter input stream with ECB mode and any available padding.
5. XOR the key stream with the plaintext and throw away the rest of the key stream (if any).
Since CTR mode is a stream cipher, you can use the exact same operation for decrypting with the only difference that the nonce must be supplied from outside. You can prepend the nonce to the ciphertext so that it can be used for decryption. It doesn't have to be secret, but it needs to be unique if the same key is used.
Note that the 64-bit block size of DES and 3DES is too small to safely encrypt many messages or long messages with CTR under the same key. If you need that, switch to a block cipher with a bigger block size like AES.
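The CTR steps can be sketched as follows, assuming Python 3. To keep the example runnable without pyDes, the block cipher is passed in as a function, and the demo uses toy_block_encrypt, a hypothetical stand-in built from HMAC that is not DES. With pyDes you would presumably pass something like lambda block: des(key, ECB).encrypt(block) instead; treat that call as an assumption to check against the pyDes documentation.

```python
import hashlib
import hmac
import os
import struct

BLOCK_SIZE = 8  # DES block size in bytes


def toy_block_encrypt(key, block):
    """Hypothetical stand-in for one DES block encryption (NOT DES, demo only)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK_SIZE]


def ctr_crypt(encrypt_block, nonce, data):
    """Encrypt or decrypt data in CTR mode; nonce is the 64-bit starting counter."""
    out = bytearray()
    counter = nonce
    for offset in range(0, len(data), BLOCK_SIZE):
        # Encrypt the counter value to get one block of key stream.
        keystream = encrypt_block(struct.pack(">Q", counter))
        chunk = data[offset:offset + BLOCK_SIZE]
        # zip() stops at the shorter input, discarding unused key stream bytes.
        out.extend(a ^ b for a, b in zip(chunk, keystream))
        # Wrap around so struct.pack never sees a value above 64 bits.
        counter = (counter + 1) % (1 << 64)
    return bytes(out)


key = b"DESCRYPT"
nonce = struct.unpack(">Q", os.urandom(8))[0]
encrypt_block = lambda block: toy_block_encrypt(key, block)

plaintext = b"counter mode needs no padding"
ciphertext = ctr_crypt(encrypt_block, nonce, plaintext)
recovered = ctr_crypt(encrypt_block, nonce, ciphertext)
```

Decryption calls the same function with the same nonce, and the ciphertext has exactly the length of the plaintext because only the counter blocks pass through the block cipher.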
When I tested RSA encryption on my laptop using PyCrypto (Archlinux, package: python-crypto/python2-crypto), I used a 1024-bit key generated by the RSA module to encrypt some random data, and it produced a ciphertext that was 127 bytes long.
Simple code to reproduce it follows (I got the values when debugging):
from Crypto.PublicKey import RSA
pubkey = b'-----BEGIN PUBLIC KEY-----\nMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDq+qbaMTZtPH3LuXLrAn37YGzc\ngrL7ieTILtkXTl5PIozJUQZ06bQXr/uS+FtvYNSvaT53ZpSyKmVmWtoX7lFzA6FW\nsILFTgFUDNRnPIQv1rQb16wi694rKPRe1uIr8/hthXtTec8b2aJovizQOlkXY0Pq\nZohNGofi02xlUD8KsQIDAQAB\n-----END PUBLIC KEY-----'
prikey = b'-----BEGIN RSA PRIVATE KEY-----\nMIICXgIBAAKBgQDq+qbaMTZtPH3LuXLrAn37YGzcgrL7ieTILtkXTl5PIozJUQZ0\n6bQXr/uS+FtvYNSvaT53ZpSyKmVmWtoX7lFzA6FWsILFTgFUDNRnPIQv1rQb16wi\n694rKPRe1uIr8/hthXtTec8b2aJovizQOlkXY0PqZohNGofi02xlUD8KsQIDAQAB\nAoGBAMkKEI0ng8Br+9i8XqTQ6gaTVjBHpmhtbw8SfexhwXCFR9zJ9PM8LDgD+gKh\neGFPgEhfi/FOE7Rnb3/mBShqXsWbqz7STJ05GOxtKo+L1z5K7X4E9WmVjIEVU46I\nhF43LJQvoDjQRbZh2cUMSYUR8+LqJJd6MFdhLJhEIf+LhCbBAkEA71lRBiSwZH/8\nsaUE4qZ/vxkS65czBcWLSCgn+7D/kvunX1hxqi3zTxMn4gyluw3IICzvLFgdDG6f\nUZk23aDcyQJBAPtTgvi4lYAIoeh6Xx8NZxroVNVBlP9BzJTBCcnX1Ym0aC/p+6n8\n7Lu9bkKk/hb0r7Oy76wzxObWv9uvRQNp+qkCQQCoOy8oEkGpYgxLEKIObNj9iLIz\nxWKne+IaJZ902UPKG/fYnGHIK+QIgH5X9GvIvjcb5nl1wbkpM9fnkrltrdOBAkBe\n7LbuHEGTHy+P8BBXWSeVOSU5etC87GxJzvNUginMHhCv8C82kCoV6sFneIvjvb1T\nIQV3RAJdscS7Q+LMHE4pAkEAzp2o8+2+9QJwzkpxGyNjJ7ZECQsZIb7MOH7LYhX0\ncnwffXFt4ttcwbyX2SdhCVPBDkczkJkOzcnEqtjoWt+dBw==\n-----END RSA PRIVATE KEY-----'
pub = RSA.importKey(pubkey)
data = b'\xc9\xc5\xa9\x1b\xc2\x0f\x05\xf0\xe3\xe1W\x9d\x94b\xc6 '
cipher = pub.encrypt(data, 0)[0]
print(len(cipher))
This will print 127 (normally it would be 128 for 1024 bits key), and I don't know why.
You are not using a correct encryption scheme. From the documentation:
Even though you may choose to directly use the methods of an RSA key object to perform the primitive cryptographic operations (e.g. _RSAobj.encrypt), it is recommended to use one of the standardized schemes instead (like Crypto.Cipher.PKCS1_v1_5 or Crypto.Signature.PKCS1_v1_5).
although nowadays the more modern/safe Crypto.Cipher.PKCS1_OAEP should be preferred over Crypto.Cipher.PKCS1_v1_5.
If you use one of these schemes then the output will always be 128 bytes. The reason for that is that PKCS#1 specifies a function called I2OSP, which converts the result of the modular exponentiation (which is a number bounded by the modulus/the key size) to a static number of octets, the key size to be exact.
The output of the direct encrypt function is what is called raw or textbook RSA: just modular exponentiation. This will just return the number, which may have leading zero bits. How many depends on chance, (somewhat) on the value of the modulus and if signed or unsigned encoding is used.
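The length difference can be reproduced without any RSA library. The sketch below, assuming Python 3, compares a minimal-length encoding of an integer with an I2OSP-style fixed-length encoding. The value c is an arbitrary stand-in for the result of a 1024-bit modular exponentiation whose top byte happens to be zero.

```python
def i2osp(x, length):
    """Integer-to-Octet-String primitive: big-endian, left-padded with zeros."""
    if x >= 256 ** length:
        raise ValueError("integer too large")
    return x.to_bytes(length, byteorder="big")


def minimal_bytes(x):
    """Shortest big-endian encoding, as a raw exponentiation result may be returned."""
    return x.to_bytes(max(1, (x.bit_length() + 7) // 8), byteorder="big")


k = 128       # byte length of a 1024-bit modulus
c = 1 << 1010  # stand-in result below 2**1016, so its top byte is zero

raw = minimal_bytes(c)
fixed = i2osp(c, k)
print(len(raw), len(fixed))
```

The padded encoding always has the key size in bytes, while the minimal encoding drops the leading zero byte and comes out one byte shorter.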
Here is an example:
p = 11, q = 5, N = p*q = 55, choose encryption exponent e = 3, so d = e^-1 mod (p-1)(q-1) = 27.
If I want to encrypt x = 13, I compute x^e = 13^3 = 52 mod 55.
I understand how to encrypt a number which is less than N, but how do I encrypt a number which is larger than N?
I know that if X is larger than N, we should decompose X into several parts and encrypt them separately, but I don't know how RSA decomposes it.
Optional question:
How do I encrypt a file with RSA on iOS or in Python?
You don't use RSA to encrypt long messages.
The correct approach is using hybrid encryption instead:
Generate a random AES key, encrypt the actual data with AES. Preferably using an authenticated mode like AES-GCM.
Encrypt the AES key with RSA. This key (128 to 256 bits) is small enough to fit within one RSA block. For example, even with small and thus weak 1024-bit RSA keys you have 500-700 bits for the actual data (the rest is consumed by the padding).
The ciphertext consists of both the RSA encrypted AES key and the AES encrypted file.
It's essential for security to apply padding here, namely OAEP. Most other paddings, including the popular PKCS#1v1.5 padding are not secure.
Don't try to split the file into blocks which you encrypt with RSA. There are no standard ways for doing this, because it's a bad idea.
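The hybrid approach can be sketched in Python with the cryptography package. This assumes a reasonably recent version of that package, where the backend argument is optional; the function names hybrid_encrypt and hybrid_decrypt, the 2048-bit key size, and the choice of SHA-256 for OAEP are assumptions made for this example.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# OAEP padding parameters shared by encryption and decryption.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def hybrid_encrypt(public_key, plaintext):
    """Encrypt the data with a fresh AES-GCM key, then wrap that key with RSA-OAEP."""
    aes_key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    ciphertext = AESGCM(aes_key).encrypt(nonce, plaintext, None)
    wrapped_key = public_key.encrypt(aes_key, OAEP)
    return wrapped_key, nonce, ciphertext


def hybrid_decrypt(private_key, wrapped_key, nonce, ciphertext):
    """Unwrap the AES key with RSA-OAEP, then decrypt and authenticate the data."""
    aes_key = private_key.decrypt(wrapped_key, OAEP)
    return AESGCM(aes_key).decrypt(nonce, ciphertext, None)


private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"x" * 10000  # far larger than one RSA block
wrapped_key, nonce, ciphertext = hybrid_encrypt(public_key, message)
recovered = hybrid_decrypt(private_key, wrapped_key, nonce, ciphertext)
```

Only the 32-byte AES key ever passes through RSA, so the message length is limited by AES-GCM rather than by the RSA modulus.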
The RSA algorithm does not handle decomposition of the message at all. It just encrypts integers of bounded size (smaller than N). In this respect it behaves like a block cipher, which encrypts messages in fixed-size "blocks".
How the blocks are obtained is generally not specified by the block-cipher itself. So, you have to decide how to split the message. One of the possible ways to decompose an integer into fixed-size blocks is to convert it to base N, and encrypt each digit separately.
Note that you should not encrypt each digit independently from the others, because that wouldn't be safe. In fact, doing so is equivalent to using a monoalphabetic cipher. There are different modes of operation for block ciphers that you can use to safely encrypt multiple blocks. You should read the Wikipedia page on block cipher modes of operation to learn about them.
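For illustration only, here is the base-N decomposition applied to the toy key from the question (N = 55, e = 3, d = 27). It encrypts each digit independently with textbook RSA, which is exactly the unsafe pattern warned about above, so it only demonstrates the arithmetic. The helper names and the value of X are arbitrary choices for this sketch.

```python
N, e, d = 55, 3, 27  # toy key from the question: p = 11, q = 5


def to_base(x, base):
    """Split a non-negative integer into base-`base` digits, least significant first."""
    digits = []
    while True:
        x, remainder = divmod(x, base)
        digits.append(remainder)
        if x == 0:
            break
    return digits


def from_base(digits, base):
    """Inverse of to_base: recombine digits given least significant first."""
    value = 0
    for digit in reversed(digits):
        value = value * base + digit
    return value


X = 123456  # a number larger than N
blocks = to_base(X, N)                        # every block is smaller than N
encrypted = [pow(m, e, N) for m in blocks]    # textbook RSA on each block
decrypted = [pow(c, d, N) for c in encrypted]
print(blocks, encrypted, from_base(decrypted, N))
```

Equal digits always produce equal ciphertext blocks here, which makes the monoalphabetic weakness directly visible.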