How to obfuscate files without using any external libraries?

I am trying to obscure the contents of files (plain .txt files would be enough) without using any libraries. I know this won't be secure at all. What I want is a program that asks for the password to "encrypt" with, then asks for the file's name, finds that file, and "encrypts" it. A second program then "decrypts" it: it asks for the password and the filename and restores the file. I don't care about actual security, so it's fine if it can be broken easily; I just need the file not to be readable when you simply click it open.
On top of that, I don't want it to use ANY libraries, so no pycrypto or anything like that.
I am on 64-bit Windows.
I am also a complete beginner in the world of code and only know basic things such as how to get user input, print stuff, if statements and while loops.
Thanks in advance!

I don't know if this qualifies as an "external library" in your mind, but if you're on a Linux machine you probably have the gpg command available to you. This is a reasonably* secure encryption tool, which you could call from Python, or directly from the command line if you just want the files protected and don't care about doing it through Python.
Alternatively, you could bang together a trivial mechanism for obscuring a file's contents based on a known password. For example, you could "stretch" the password to the length of the file text (multiply the string by 1 + (text length // password length)) and then zip the two together. This gives you a bunch of tuples of characters, which can be converted to their ordinal values (ord('f') => 102, for example), XORed together (ord('f') ^ ord('b') => 4), and converted back to chars (chr(4) => the unprintable '\x04'). The resulting chars are your ciphertext. Decryption is the same operation with the same password, because XORing twice with the same value gives back the original.
All of this is trivial to break, of course, but it's easy to implement, and decryption is trivial.
*intentional understatement :)
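The XOR scheme described above could be sketched roughly as follows in Python 3, using only built-ins. The function names xor_with_password and obfuscate_file are made up for this illustration, and the file is read as raw bytes rather than text so that any file type works. This hides the contents from a casual click and nothing more.

```python
def xor_with_password(data, password):
    """XOR every byte of data with the repeating bytes of the password."""
    key = password.encode("utf-8")
    if not key:
        raise ValueError("password must not be empty")
    # key[i % len(key)] "stretches" the password to the length of the data
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def obfuscate_file(filename, password):
    """Rewrite the file in place; a second call with the same password restores it."""
    with open(filename, "rb") as f:
        data = f.read()
    with open(filename, "wb") as f:
        f.write(xor_with_password(data, password))
```

Because the operation is its own inverse, one program can serve for both directions: something like obfuscate_file(input("File name: "), input("Password: ")) run once scrambles the file, and run again with the same password restores it.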

You can try using the password as a key to encrypt it. A bitwise operation applied to the file's bytes will obscure it very simply. XOR is the usual choice because it can be undone with the same key; OR and AND lose information and cannot be reversed. As you mentioned, it won't be secure.
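A quick check of which bitwise operations can actually be undone, as a small Python sketch. The key value is arbitrary and chosen only for illustration.

```python
key = 0b10101010  # arbitrary example key byte

# XOR is its own inverse: applying the same key twice restores every byte value.
assert all((b ^ key) ^ key == b for b in range(256))

# OR and AND map several different inputs onto the same output,
# so the original byte cannot be recovered from the result.
or_outputs = {b | key for b in range(256)}
and_outputs = {b & key for b in range(256)}
xor_outputs = {b ^ key for b in range(256)}

print(len(or_outputs), len(and_outputs), len(xor_outputs))  # prints: 16 16 256
```

Only XOR keeps all 256 byte values distinct, which is why it is the operation to use when the file has to be restored later.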

You can use XTEA (eXtended Tiny Encryption Algorithm) by copying the Python code below into your project; it is only 28 lines of Python (written for Python 2). It has been subjected to cryptanalysis and shown to be reasonably secure.
import struct
def crypt(key,data,iv='\00\00\00\00\00\00\00\00',n=32):
    def keygen(key,iv,n):
        while True:
            iv = xtea_encrypt(key,iv,n)
            for k in iv:
                yield ord(k)
    xor = [ chr(x^y) for (x,y) in zip(map(ord,data),keygen(key,iv,n)) ]
    return "".join(xor)
def xtea_encrypt(key,block,n=32,endian="!"):
    v0,v1 = struct.unpack(endian+"2L",block)
    k = struct.unpack(endian+"4L",key)
    sum,delta,mask = 0L,0x9e3779b9L,0xffffffffL
    for round in range(n):
        v0 = (v0 + (((v1<<4 ^ v1>>5) + v1) ^ (sum + k[sum & 3]))) & mask
        sum = (sum + delta) & mask
        v1 = (v1 + (((v0<<4 ^ v0>>5) + v0) ^ (sum + k[sum>>11 & 3]))) & mask
    return struct.pack(endian+"2L",v0,v1)
def xtea_decrypt(key,block,n=32,endian="!"):
    v0,v1 = struct.unpack(endian+"2L",block)
    k = struct.unpack(endian+"4L",key)
    delta,mask = 0x9e3779b9L,0xffffffffL
    sum = (delta * n) & mask
    for round in range(n):
        v1 = (v1 - (((v0<<4 ^ v0>>5) + v0) ^ (sum + k[sum>>11 & 3]))) & mask
        sum = (sum - delta) & mask
        v0 = (v0 - (((v1<<4 ^ v1>>5) + v1) ^ (sum + k[sum & 3]))) & mask
    return struct.pack(endian+"2L",v0,v1)
Attribution, code from: ActiveState Code » Recipes

Related

AES/PKCS5/SHSA256

I am doing encryption and decryption with AES/CBC and PKCS5 padding in Python.
As far as I know, Java has the option AES/CBC/PKCS5.
The Python program needs to pad the same way PKCS5 does.
PKCS5 is defined for an 8-byte block, but when I print AES.block_size in Python, it prints 16.
Following the definition of PKCS5, I tried:
text = text + (8 - len(text) % 8) * chr(8 - len(text) % 8)
When I encrypt that text, an error message says the input must be a multiple of 16 in length.
When I change the 8 to AES.block_size, all the code works well.
However, I think changing 8 to AES.block_size does not fit the definition of PKCS5.
Please help me understand.

Create a padding function
def pad(text):
    # AES is the cipher module already imported in the question.
    # Number of padding characters needed to reach the next multiple of the block size
    pad = AES.block_size - (len(text) % AES.block_size)
    assert 1 <= pad <= AES.block_size
    return text + chr(pad) * pad
and apply the encryption to pad(text) instead.
There are historical reasons why the same scheme goes by the name PKCS5 for some block sizes and PKCS7 for others. The padding scheme is the same. PKCS5 is historically tied to DES (before AES even existed), when all block ciphers in the then-current standards (PKCS is such a standard) had 8-byte blocks; PKCS7 describes the same padding for other block sizes, such as the 16 bytes AES uses.
Just use the above for AES in Python and the corresponding padding option for AES in Java and you'll be OK in principle.
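To make the padding rule concrete, here is a minimal Python 3 sketch that works on bytes and needs no crypto library. The names pkcs7_pad, pkcs7_unpad and BLOCK_SIZE are chosen for this example; BLOCK_SIZE is hard-coded to 16, the value the question reports for AES.block_size.

```python
BLOCK_SIZE = 16  # AES block size in bytes, as printed by AES.block_size in the question


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)  # always between 1 and block_size
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=BLOCK_SIZE):
    """Strip the padding again, rejecting input that is not validly padded."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input length is not a multiple of the block size")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]
```

Calling pkcs7_pad(data, 8) reproduces the 8-byte PKCS5 case from the question, which shows that the two names refer to the same rule with a different block size. Note that input that is already a multiple of the block size still gets a full extra block of padding, so that unpadding is unambiguous.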

Reading A Binary File In Fortran That Was Created By A Python Code

I have a binary file that was created by a Python program. That program mainly scripts a bunch of tasks to pre-process a set of data files. I would now like to read this binary file in Fortran. The binary file contains the coordinates of points in a simple format, e.g.: number of points, x0, y0, z0, x1, y1, z1, ....
These binary files were created using the 'tofile' function in numpy. I have the following code in Fortran so far:
integer:: intValue
double precision:: dblValue
integer:: counter
integer:: check
open(unit=10, file='file.bin', form='unformatted', status='old', access='stream')
counter = 1
do
    if ( counter == 1 ) then
        read(unit=10, iostat=check) intValue
        if ( check < 0 ) then
            print*,"End Of File"
            stop
        else if ( check > 0 ) then
            print*, "Error Detected"
            stop
        else if ( check == 0 ) then
            counter = counter + 1
            print*, intValue
        end if
    else if ( counter > 1 ) then
        read(unit=10, iostat=check) dblValue
        if ( check < 0 ) then
            print*,"End Of File"
            stop
        else if ( check > 0 ) then
            print*, "Error Detected"
            stop
        else if ( check == 0 ) then
            counter = counter + 1
            print*,dblValue
        end if
    end if
end do
close(unit=10)
This unfortunately does not work, and I get garbage numbers (e.g 6.4731191026611484E+212, 2.2844499004808491E-279 etc.). Could someone give some pointers on how to do this correctly?
Also what would be a good way of writing and reading binary files interchangeably between Python and Fortran - as it seems like that is going to be one of the requirements of my application.
Thanks

Here's a trivial example of how to take data generated with numpy to Fortran the binary way.
I calculated 360 values of sin on [0,2π),
#!/usr/bin/env python3
import numpy as np
with open('sin.dat', 'wb') as outfile:
    np.sin(np.arange(0., 2*np.pi, np.pi/180.,
                     dtype=np.float32)).tofile(outfile)
exported them with tofile to the binary file 'sin.dat', which has a size of 1440 bytes (360 * sizeof(float32)), and read that file with this Fortran 95 program (compiled with gfortran -O3 -Wall -pedantic), which outputs 1. - (val**2 + cos(x)**2) for x in [0,2π):
program numpy_import
    integer, parameter :: REAL_KIND = 4
    integer, parameter :: UNIT = 10
    integer, parameter :: SAMPLE_LENGTH = 360
    real(REAL_KIND), parameter :: PI = acos(-1.)
    real(REAL_KIND), parameter :: DPHI = PI/180.
    real(REAL_KIND), dimension(0:SAMPLE_LENGTH-1) :: arr
    real(REAL_KIND) :: r
    integer :: i
    open(UNIT, file="sin.dat", form='unformatted',&
         access='direct', recl=4)
    do i = 0,ubound(arr, 1)
        read(UNIT, rec=i+1, err=100) arr(i)
    end do
    do i = 0,ubound(arr, 1)
        r = 1. - (arr(i)**2. + cos(real(i*DPHI, REAL_KIND))**2)
        write(*, '(F6.4, " ")', advance='no')&
            real(int(r*1E6+1)/1E6, REAL_KIND)
    end do
100 close(UNIT)
    write(*,*)
end program numpy_import
Thus, if val == sin(x), the numeric result must vanish to a good approximation for float32 types.
And indeed, the output is:
360 x 0.0000

So thanks to this great community, from all the advice I got, and a little bit of tinkering around, I think I figured out a stable solution to this problem, and I wanted to share this answer with you all. I will provide a minimal example here, where I want to write a variable-size array from Python into a binary file and read it using Fortran. I am assuming that the number of rows numRows and the number of columns numCols are written along with the full array dataArray. The following Python script writeBin.py writes the file:
import numpy as np
# Read in the numRows and numCols value
# Read in the array values
# Note: dataArray must also have dtype float32 to match real(kind=real32) on the Fortran side
numRowArr = np.array([numRows], dtype=np.float32)
numColArr = np.array([numCols], dtype=np.float32)
fileObj = open('pybin.bin', 'wb')
numRowArr.tofile(fileObj)
numColArr.tofile(fileObj)
for i in range(numRows):
    lineArr = dataArray[i,:]
    lineArr.tofile(fileObj)
fileObj.close()
Following this, the Fortran code to read the array from the file can be written as follows:
program readBin
    use iso_fortran_env
    implicit none
    integer:: nR, nC, i
    real(kind=real32):: numRowVal, numColVal
    real(kind=real32), dimension(:), allocatable:: rowData
    real(kind=real32), dimension(:,:), allocatable:: fullData
    open(unit=10,file='pybin.bin',form='unformatted',status='old',access='stream')
    read(unit=10) numRowVal
    nR = int(numRowVal)
    read(unit=10) numColVal
    nC = int(numColVal)
    allocate(rowData(nC))
    allocate(fullData(nR,nC))
    do i = 1, nR
        read(unit=10) rowData
        fullData(i,:) = rowData(:)
    end do
    close(unit=10)
end program readBin
The main point that I gathered from the discussion on this thread is to match the read and the write as much as possible, with precise specifications of the data types to be read, the way they are written etc. As you may note, this is a made up example, so there may be some things here and there that are not perfect. However, I have used this now to program a finite element program, and the mesh data was where I used this binary read/write - and it worked very well.
P.S.: In case you find a typo, please let me know, and I will edit it out right away.
Thanks a lot.
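The point about matching types exactly can also be illustrated without numpy. The sketch below uses Python's standard struct module to write the layout from the original question (a point count followed by x, y, z triples) with every width and byte order spelled out. The function names and the choice of a little-endian 32-bit count with 64-bit coordinates are assumptions made for this example; on the Fortran side they would correspond to integer(int32) and real(real64) read with access='stream' on a little-endian machine.

```python
import struct


def write_points(path, points):
    """Write a 4-byte little-endian count, then three 8-byte doubles per point."""
    with open(path, "wb") as f:
        f.write(struct.pack("<i", len(points)))
        for x, y, z in points:
            f.write(struct.pack("<3d", x, y, z))


def read_points(path):
    """Read the file back using exactly the same widths and byte order."""
    with open(path, "rb") as f:
        (count,) = struct.unpack("<i", f.read(4))
        return [struct.unpack("<3d", f.read(24)) for _ in range(count)]
```

With this layout the file size is always 4 + 24 * number of points bytes, which gives a quick sanity check before the reader on the other side is even written.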

Optimization: Python, Perl, and a C Suffix Tree Library

I've got about 3,500 files that each consist of a single-line character string. The files vary in size (from about 200 B to 1 MB). I'm trying to compare each file with each other file and find a common subsequence of length 20 characters between two files. Note that the subsequence only has to be common between the two files in each comparison, not common among all files.
I've struggled with this problem a bit, and since I'm not an expert, I've ended up with a bit of an ad-hoc solution. I use itertools.combinations to build a list in Python that ends up with around 6,239,278 combinations. I then pass the files two at a time to a Perl script that acts as a wrapper for a suffix tree library written in C called libstree. I've tried to avoid this type of solution, but the only comparable C suffix tree wrapper in Python suffers from a memory leak.
So here's my problem. I've timed it, and on my machine the solution processes about 500 comparisons in 25 seconds. That means it'll take around 3 days of continuous processing to complete the task. And then I have to do it all again to look at, say, 25 characters instead of 20. Please note that I'm way out of my comfort zone and not a very good programmer, so I'm sure there is a much more elegant way to do this. I thought I'd ask here and show my code to see if anyone has any suggestions as to how I could complete this task faster.
Python code:
from itertools import combinations
import glob, subprocess
glist = glob.glob("Data/*.g")
i = 0
for a,b in combinations(glist, 2):
    i += 1
    p = subprocess.Popen(["perl", "suffix_tree.pl", a, b, "20"], shell=False, stdout=subprocess.PIPE)
    p = p.stdout.read()
    a = a.split("/")
    b = b.split("/")
    a = a[1].split(".")
    b = b[1].split(".")
    print str(i) + ":" + str(a[0]) + " --- " + str(b[0])
    if p != "" and len(p) == 20:
        with open("tmp.list", "a") as openf:
            openf.write(a[0] + " " + b[0] + "\n")
Perl code:
use strict;
use Tree::Suffix;
open FILE, "<$ARGV[0]";
my $a = do { local $/; <FILE> };
open FILE, "<$ARGV[1]";
my $b = do { local $/; <FILE> };
my @g = ($a,$b);
my $st = Tree::Suffix->new(@g);
my ($c) = $st->lcs($ARGV[2],-1);
print "$c";

Rather than writing Python to call Perl to call C, I'm sure you would be better off dropping the Python code and writing it all in Perl.
If your files are certain to contain exactly one line then you can read the pairs more simply by writing just
my @g = <>;
I believe the program below performs the same function as your Python and Perl code combined, but I cannot test it as I am unable to install libstree at present.
But as ikegami has pointed out, it would be far better to calculate and store the longest common subsequence for each pair of files and put them into categories afterwards. I won't go on to code this as I don't know what information you need - whether it is just subsequence length or if you need the characters and/or the position of the subsequences as well.
use strict;
use warnings;
use Math::Combinatorics;
use Tree::Suffix;
my @glist = glob "Data/*.g";
my $iterator = Math::Combinatorics->new(count => 2, data => \@glist);
open my $fh, '>', 'tmp.list' or die $!;
my $n = 0;
while (my @pair = $iterator->next_combination) {
    $n++;
    # Read both files of the pair through the <> operator
    @ARGV = @pair;
    my @g = <>;
    my $tree = Tree::Suffix->new(@g);
    my $lcs = $tree->lcs;
    # Reduce each path to the bare file name without directory or extension
    @pair = map m|/(.+?)\.|, @pair;
    print "$n: $pair[0] --- $pair[1]\n";
    print $fh "@pair\n" if $lcs and length $lcs >= 20;
}
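The suggestion to compute the longest common substring once per pair and categorise afterwards could look roughly like this in Python. This sketch swaps the suffix tree for difflib.SequenceMatcher from the standard library, which is far slower on large inputs, so it only illustrates the "compute once, threshold later" idea. The function names and the dict of in-memory strings standing in for the files are assumptions for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def longest_common_substring_length(a, b):
    """Length of the longest contiguous run of characters shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size


def lcs_table(strings):
    """Map each pair of names to the length of its longest common substring."""
    return {
        (name_a, name_b): longest_common_substring_length(strings[name_a], strings[name_b])
        for name_a, name_b in combinations(sorted(strings), 2)
    }


def pairs_at_least(table, min_length):
    """Pairs whose stored length reaches the threshold; no recomputation needed."""
    return sorted(pair for pair, length in table.items() if length >= min_length)
```

Once lcs_table has been built, pairs_at_least(table, 20) and pairs_at_least(table, 25) are both cheap lookups, so the multi-day comparison does not have to be repeated for each new length.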

RSA Encryption using Python

I'm trying to RSA-encrypt a word 2 characters at a time, padding with a space, using Python, but I'm not sure how to go about it.
For example, if the encryption exponent was 8 and the modulus 37329 and the word was 'Pound', how would I go about it? I know I need to start with pow(ord('P') and need to take into consideration that the word is 5 characters and I need to do it 2 characters at a time, padding with a space. I'm not sure, but do I also need to use <<8 somewhere?
Thank you

Here's a basic example:
>>> msg = 2495247524
>>> code = pow(msg, 65537, 5551201688147) # encrypt
>>> code
4548920924688L
>>> plaintext = pow(code, 109182490673, 5551201688147) # decrypt
>>> plaintext
2495247524
See the ASPN cookbook recipe for more tools for working with mathematical part of RSA style public key encryption.
The details of how characters get packed and unpacked into blocks and how the numbers get encoded is a bit arcane. Here is a complete, working RSA module in pure Python.
For your particular packing pattern (2 characters at a time, padded with spaces), this should work:
>>> plaintext = 'Pound'
>>> plaintext += ' '      # this will get thrown away for even lengths
>>> for i in range(0, len(plaintext), 2):
        group = plaintext[i: i+2]
        plain_number = ord(group[0]) * 256 + ord(group[1])
        encrypted = pow(plain_number, 8, 37329)
        print group, '-->', plain_number, '-->', encrypted
Po --> 20591 --> 12139
un --> 30062 --> 2899
d --> 25632 --> 23784
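The same packing can be written as small reusable functions in Python 3. This is only a sketch with made-up names (pack_pairs, encrypt_blocks). It also answers the <<8 question: multiplying the first character code by 256 is the same as shifting it left by 8 bits.

```python
def pack_pairs(text):
    """Turn text into numbers, two characters per number, padding with a space."""
    if len(text) % 2:
        text += " "
    # ord(first) * 256 is the same as ord(first) << 8
    return [(ord(text[i]) << 8) | ord(text[i + 1]) for i in range(0, len(text), 2)]


def encrypt_blocks(blocks, exponent, modulus):
    """Apply the RSA operation block ** exponent mod modulus to every block."""
    return [pow(block, exponent, modulus) for block in blocks]


blocks = pack_pairs("Pound")
print(blocks)  # prints: [20591, 30062, 25632]
print(encrypt_blocks(blocks, 8, 37329))
```

Each packed block must be smaller than the modulus for the result to be meaningful. That holds for plain ASCII letters with the modulus 37329 from the question, but not for every possible pair of characters.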

If you want to code RSA encryption efficiently in Python, my GitHub repository should help you understand and interpret the mathematical definitions of RSA in Python.
Cryptogrphic Algoritms Implementation Using Python
RSA Key Generation
import math
from random import randint

def keyGen():
    ''' Generate Keypair '''
    # PrimeGen, lcm and modinv are helper functions from the repository (not shown here)
    i_p=randint(0,20)
    i_q=randint(0,20)
    # Instead of asking the user for the prime numbers, which is not feasible,
    # pick two distinct primes at random from a generated list
    while i_p==i_q:
        i_q=randint(0,20)
    primes=PrimeGen(100)
    p=primes[i_p]
    q=primes[i_q]
    #computing n=p*q as a part of the RSA Algorithm
    n=p*q
    #Computing lamda(n), the Carmichael's totient Function.
    # In this case, lamda(n) = lcm(lamda(p), lamda(q)) = lcm(p-1, q-1)
    # Alternatively, Euler's totient function phi(n) = (p-1)*(q-1) can be used;
    # it is a multiple of lamda(n) and so may give a larger d
    lamda_n=int(lcm(p-1,q-1))
    e=randint(1,lamda_n)
    #checking the Following : whether e and lamda(n) are co-prime
    while math.gcd(e,lamda_n)!=1:
        e=randint(1,lamda_n)
    #Determine the modular Multiplicative Inverse
    d=modinv(e,lamda_n)
    #return the Key Pairs
    # Public Key pair : (e,n), private key pair:(d,n)
    return ((e,n),(d,n))
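Since the helper functions are not shown, here is a self-contained toy version of the same key generation steps in Python 3. The names primes_below and toy_keygen are invented for this sketch, the primes are tiny and therefore useless for real security, and pow(e, -1, lam) for the modular inverse assumes Python 3.8 or newer.

```python
import math
import random


def primes_below(limit):
    """Sieve of Eratosthenes: all primes smaller than limit."""
    sieve = [True] * limit
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(range(i * i, limit, i))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def toy_keygen(rng=random):
    """Return ((e, n), (d, n)) built from two distinct small primes."""
    # Skip the ten smallest primes so that n is not trivially small
    p, q = rng.sample(primes_below(1000)[10:], 2)
    n = p * q
    # Carmichael's function: lambda(n) = lcm(p - 1, q - 1)
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    while True:
        e = rng.randrange(3, lam)
        if math.gcd(e, lam) == 1:
            break
    d = pow(e, -1, lam)  # modular multiplicative inverse of e modulo lambda(n)
    return (e, n), (d, n)
```

A round trip checks the result: with (e, n), (d, n) = toy_keygen(), the expression pow(pow(m, e, n), d, n) gives back m for any m below n.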

Base64 and non standard

I am trying to create a Python client for Bacula, but I have a problem with the authentication.
The algorithm is:
import hmac
import base64
import re
...
challenge = re.search("auth cram-md5 ()", data)
# example: ''
passwd = 'b489c90f3ee5b3ca86365e1bae27186e'
hm = hmac.new(passwd, challenge).digest()
rep = base64.b64encode(hm).strip().rstrip('=')
# result with python : 9zKE3VzYQ1oIDTpBuMMowQ
# result with bacula client : 9z+E3V/YQ1oIDTpBu8MowB
Is there a simpler way than porting Bacula's implementation of base64?
int
bin_to_base64(char *buf, int buflen, char *bin, int binlen, int compatible)
{
uint32_t reg, save, mask;
int rem, i;
int j = 0;
reg = 0;
rem = 0;
buflen--; /* allow for storing EOS */
for (i=0; i >= (rem - 6);
if (j

To verify your CRAM-MD5 implementation, it is best to use some simple test vectors and check combinations of (challenge, password, username) inputs against the expected output.
Here's one example (from http://blog.susam.in/2009/02/auth-cram-md5.html):
import hmac
username = 'foo@susam.in'
passwd = 'drowssap'
encoded_challenge = 'PDc0NTYuMTIzMzU5ODUzM0BzZGNsaW51eDIucmRzaW5kaWEuY29tPg=='
challenge = encoded_challenge.decode('base64')
digest = hmac.new(passwd, challenge).hexdigest()
response = username + ' ' + digest
encoded_response = response.encode('base64')
print encoded_response
# Zm9vQHN1c2FtLmluIDY2N2U5ZmE0NDcwZGZmM2RhOWQ2MjFmZTQwNjc2NzIy
That said, I've certainly found examples on the net where the response generated by the above code differs from the expected response stated on the relevant site, so I'm still not entirely clear as to what is happening in those cases.
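For reference, the same test vector can be run in Python 3 along these lines. This is a sketch: the function name cram_md5_response is made up, and the digest algorithm has to be passed explicitly because newer Python versions no longer default to MD5 in hmac.new.

```python
import base64
import hmac


def cram_md5_response(username, password, encoded_challenge):
    """Build the base64-encoded 'username hexdigest' reply to a CRAM-MD5 challenge."""
    challenge = base64.b64decode(encoded_challenge)
    digest = hmac.new(password.encode(), challenge, "md5").hexdigest()
    return base64.b64encode(f"{username} {digest}".encode()).decode()


encoded_challenge = "PDc0NTYuMTIzMzU5ODUzM0BzZGNsaW51eDIucmRzaW5kaWEuY29tPg=="
print(base64.b64decode(encoded_challenge))  # the raw challenge sent by the server
print(cram_md5_response("foo@susam.in", "drowssap", encoded_challenge))
```

The second line of output can be compared against the expected response quoted in the Python 2 example above.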

I HAVE CRACKED THIS.
I ran into exactly the same problem you did, and have just spent about 4 hours identifying the problem and reimplementing it.
The problem is that Bacula's base64 is BROKEN, AND WRONG!
There are two problems with it:
The first is that the incoming bytes are treated as signed, not unsigned. The effect of this is that, if a byte has the highest bit set (>127), it is treated as a negative number; when it is combined with the "left over" bits from previous bytes, those left-over bits are all set to binary 1.
The second is that, after the encoder has processed all the full 6-bit output blocks, there may be 0, 2 or 4 bits left over (depending on the input length modulo 3). The standard Base64 way to handle this is to shift the remaining bits left, so they are the HIGHEST bits in the last 6-bit block, and process them. Bacula leaves them as the LOWEST bits.
Note that some versions of Bacula may accept both the "Bacula broken base64 encoding" and the standard one for incoming authentication; they seem to use the broken one for their own authentication.
def bacula_broken_base64(binarystring):
    b64_chars="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    remaining_bit_count=0
    remaining_bits=0
    output=""
    for inputbyte in binarystring:
        inputbyte=ord(inputbyte)
        if inputbyte>127:
            # REPRODUCING A BUG! set all the "remaining bits" to 1.
            remaining_bits=(1 << remaining_bit_count) - 1
        remaining_bits=(remaining_bits<<8)+inputbyte
        remaining_bit_count+=8
        while remaining_bit_count>=6:
            # clean up:
            remaining_bit_count-=6
            new64=(remaining_bits>>remaining_bit_count) & 63 # 6 highest bits
            output+=b64_chars[new64]
            remaining_bits&=(1 << remaining_bit_count) - 1
    if remaining_bit_count>0:
        output+=b64_chars[remaining_bits]
    return output
I realize it's been 6 years since you asked, but perhaps someone else will find this useful.
