Difference between BCH code in MATLAB and python - python

I have to implement a BCH error-correcting code. I found a BCH library for Python and a BCH encoder for MATLAB, but the two behave differently: BCH(127,70) in Python can correct up to 70 bit flips in a block of 127, whereas the MATLAB code can correct only up to 15 bits out of 127 with BCH(127,15).
Why do these implementations perform differently?
Python Code
import bchlib
import hashlib
import os
import random
# create a bch object
BCH_POLYNOMIAL = 8219
BCH_BITS = 72
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)
# random data
data = bytearray(os.urandom(127))
# encode and make a "packet"
ecc = bch.encode(data)
packet = data + ecc
# print length of ecc, data, and packet
print('data size: %d' % (len(data)))
print('ecc size: %d' % (len(ecc)))
print('packet size: %d' % (len(packet)))
# print hash of packet
sha1_initial = hashlib.sha1(packet)
print('sha1: %s' % (sha1_initial.hexdigest(),))
def bitflip(packet):
    byte_num = random.randint(0, len(packet) - 1)
    bit_num = random.randint(0, 7)
    packet[byte_num] ^= (1 << bit_num)
# make BCH_BITS errors
for _ in range(BCH_BITS):
    bitflip(packet)
# print hash of packet
sha1_corrupt = hashlib.sha1(packet)
print('sha1: %s' % (sha1_corrupt.hexdigest(),))
# de-packetize
data, ecc = packet[:-bch.ecc_bytes], packet[-bch.ecc_bytes:]
# correct
bitflips = bch.decode_inplace(data, ecc)
print('bitflips: %d' % (bitflips))
# packetize
packet = data + ecc
# print hash of packet
sha1_corrected = hashlib.sha1(packet)
print('sha1: %s' % (sha1_corrected.hexdigest(),))
if sha1_initial.digest() == sha1_corrected.digest():
    print('Corrected!')
else:
    print('Failed')
This outputs
data size: 127
ecc size: 117
packet size: 244
sha1: 4ee71f947fc5d561b211a551c87fdef18a83404b
sha1: a072664312114fe59f5aa262bed853e35d70d349
bitflips: 72
sha1: 4ee71f947fc5d561b211a551c87fdef18a83404b
Corrected!
MATLAB code
%% bch params
M = 7;
n = 2^M-1; % Codeword length
k = 15; % Message length
nwords = 2; % Number of words to encode
% create a msg
msgTx = gf(randi([0 1],nwords,k));
%disp(msgTx)
%Find the error-correction capability.
t = bchnumerr(n,k)
% Encode the message.
enc = bchenc(msgTx,n,k);
%Corrupt up to t bits in each codeword.
noisycode = enc + randerr(nwords,n,1:t);
%Decode the noisy code.
msgRx = bchdec(noisycode,n,k);
% Validate that the message was properly decoded.
isequal(msgTx,msgRx)
which outputs:
t = 27
ans = logical 1
Increasing k above 15 in the MATLAB code gives the following error:
Error using bchnumerr (line 72)
The values for N and K do not produce a valid narrow-sense BCH code.
Error in bchTest (line 10)
t = bchnumerr(n,k)

I discovered this question today (24 January 2021) as I searched for other information about BCH codes.
See Appendix A: Code Generators for BCH Codes (pdf) of Error-Correction Coding for Digital Communications by George C. Clark and J. Bibb Cain:
For n = 127 and k = 15, t = 27 is the number of errors that can be corrected.
For n = 127, the next option with larger k is k = 22 and t = 23.
Your use of the Python library is confusing. For standard usage of BCH codes, the length of a codeword is equal to 2^m - 1 for some positive integer m. The codeword in your example is not of this form.
I have not used that Python library, so I cannot write with certainty. If ecc is of length 127, then I suspect that it is a codeword. Concatenating ecc and data yields a packet that has a copy of the original message data as well as a copy of the codeword. This is not how BCH codes are used. When you have the codeword, you don't need to send it and a separate copy of the original message.
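A quick arithmetic check may help reconcile the two examples. The sketch below assumes, without having verified it against the library's documentation, that the first argument to bchlib.BCH (8219) is a primitive polynomial whose degree m defines the field GF(2^m), and that a t-error-correcting binary BCH code needs at most m*t parity bits. Under those assumptions the Python example works on 127 bytes of data, not 127 bits:

```python
# Assumption: 8219 encodes a primitive polynomial; its degree m sets GF(2^m).
BCH_POLYNOMIAL = 8219
BCH_BITS = 72                            # t, the requested correction capability

m = BCH_POLYNOMIAL.bit_length() - 1      # degree of the polynomial
n = 2**m - 1                             # natural codeword length in bits
max_parity_bits = m * BCH_BITS           # upper bound on the number of parity bits
parity_bytes = -(-max_parity_bits // 8)  # round up to whole bytes

data_bits = 127 * 8                      # os.urandom(127) returns 127 bytes

print("m =", m)
print("n =", n)
print("parity bytes <=", parity_bytes)
print("data bits + parity bits =", data_bits + max_parity_bits)
```

The bound of 117 bytes matches the printed ecc size. That would be consistent with a shortened code of natural length 8191 bits correcting 72 errors in a 1952-bit packet, rather than with a BCH(127,70) code.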
If you do read the reference linked above, be aware of the notation used to describe the polynomials. For the n = 127 table, the polynomial g_1(x) is denoted by 211, which is octal notation. The nonzero bits in the binary expansion indicate the nonzero coefficients of the polynomial.
octal: 211
binary: 010 001 001
polynomial: x^7 + x^3 + 1
The polynomial g_2(x) is equal to g_1(x) multiplied by another polynomial:
octal: 217
binary: 010 001 111
polynomial: x^7 + x^3 + x^2 + x + 1
This means that
g_2(x) = (x^7 + x^3 + 1)(x^7 + x^3 + x^2 + x + 1)
Each g_{t+1}(x) is equal to g_t(x) multiplied by another polynomial.
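The octal notation and the product above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names poly_to_str and gf2_mul are made up for this example, and polynomials over GF(2) are stored as integer bit masks in which bit i is the coefficient of x^i.

```python
def poly_to_str(p):
    """Format an integer bit mask as a polynomial over GF(2)."""
    terms = []
    for power in range(p.bit_length() - 1, -1, -1):
        if (p >> power) & 1:
            if power == 0:
                terms.append("1")
            elif power == 1:
                terms.append("x")
            else:
                terms.append("x^%d" % power)
    return " + ".join(terms) if terms else "0"

def gf2_mul(a, b):
    """Carry-less product of two GF(2) polynomials stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return result

g1 = 0o211                # octal 211
factor = 0o217            # octal 217
g2 = gf2_mul(g1, factor)

print(poly_to_str(g1))
print(poly_to_str(factor))
print(oct(g2))
```

With these inputs the product comes out as octal 41567, a degree-14 polynomial.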

Related

Separate Data From Gyroscope Serial Port, Python

I have a Bluetooth gyroscope that I only want accelerometer data from. When I open the port the data will come in as a single, jumbled stream, right? How do I grab the data that I want? I want to simulate a keypress if acceleration is over a certain value, if that helps.
Since the data packet is 11 bytes, read 11 bytes at a time from the port, then parse the message. Note, you may want to read 1 byte at a time until you get a start-message byte (0x55) then read the following 10.
# data is a byte array, len = 11
def parse_packet(data):
    # Verify message
    if data[0] == 0x55 and data[1] == 0x51 and len(data) == 11:
        # Verify checksum
        if ((sum(data) - data[10]) & 0xFF) == data[10]:
            g = 9.8 # Gravity
            parsed = {}
            parsed["Ax"] = ((data[3] << 8) | data[2]) / 32768.0 * 16 * g
            parsed["Ay"] = ((data[5] << 8) | data[4]) / 32768.0 * 16 * g
            parsed["Az"] = ((data[7] << 8) | data[6]) / 32768.0 * 16 * g
            # Temp in degrees Celsius
            parsed["Temp"] = ((data[9] << 8) | data[8]) / 340.0 + 36.53
            return parsed
    return None
The one thing you need to verify is the checksum calculation. I couldn't find it in the manual. The other calculations came from the manual I found here: https://www.manualslib.com/manual/1303256/Elecmaster-Jy-61-Series.html?page=9#manual
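The byte-at-a-time synchronisation described above could look roughly like the sketch below. The name read_packet is hypothetical, and the function only assumes an object with a read(n) method that returns bytes (a serial port object would normally offer one). An in-memory stream stands in for the port here, and the checksum rule is the same unverified assumption as in parse_packet.

```python
import io

PACKET_LEN = 11
START_BYTE = 0x55

def read_packet(stream):
    """Return the next 11-byte packet from `stream`, or None when data runs out."""
    # Discard bytes one at a time until the start-of-message byte shows up.
    while True:
        first = stream.read(1)
        if not first:
            return None
        if first[0] == START_BYTE:
            break
    # Read the remaining 10 bytes of the packet.
    rest = stream.read(PACKET_LEN - 1)
    if len(rest) < PACKET_LEN - 1:
        return None
    return bytearray(first + rest)

# Demo with two junk bytes in front of one well-formed acceleration packet.
payload = bytes([0x51, 1, 2, 3, 4, 5, 6, 7, 8])
checksum = (START_BYTE + sum(payload)) & 0xFF
raw = bytes([0x00, 0x13, START_BYTE]) + payload + bytes([checksum])

packet = read_packet(io.BytesIO(raw))
print(list(packet))
```

Since 0x55 can also occur inside the payload, a packet found this way should still go through the header and checksum test before its values are trusted.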

Restore corrupt 128-bit key from SHA-1

Disclaimer: This is a section from a uni assignment
I have been given the following AES-128-CBC key and told that up to 3 bits in the key have been changed/corrupt.
d9124e6bbc124029572d42937573bab4
The original key's SHA-1 hash is provided;
439090331bd3fad8dc398a417264efe28dba1b60
and I have to find the original key by trying all combinations of up to 3 bit flips.
Supposedly this is possible in 349633 guesses, but I don't have a clue where that number comes from. I would have assumed it would be closer to 128*127*126, which is over 2M combinations. That's my first problem.
Secondly, I wrote the Python script below, which uses a triple nested loop (I know, far from the best code...) to iterate over all 2M possibilities. However, when it finished an hour later it hadn't found any matches, which I really don't understand.
Hoping someone can at least point me in the right direction, cheers
#!/usr/bin/python2
import sys
import commands
global binary
def inverseBit(index):
    global binary
    if binary[index] == "0":
        return "1"
    return "0"
if __name__ == '__main__':
    if len(sys.argv) != 3:
        print "Usage: bitflip.py <hex> <sha-1>"
        sys.exit()
    global binary
    binary = ""
    sha = str(sys.argv[2])
    binary = str(bin(int(sys.argv[1], 16)))
    binary = binary[2:]
    print binary
    b2 = binary
    tries = 0
    file = open("shas", "w")
    for x in range(-2, 128):
        for y in range(-1,128):
            for z in range(0,128):
                if x >= 0:
                    b2 = b2[:x] + inverseBit(x) + b2[x+1:]
                if y >= 0:
                    b2 = b2[:y] + inverseBit(y) + b2[y+1:]
                b2 = b2[:z] + inverseBit(z) + b2[z+1:]
                #print b2
                hexOut = hex(int(b2,2))
                command = "echo -n \"" + hexOut + "\" | openssl sha1"
                cmdOut = str(commands.getstatusoutput(command))
                cmdOut = cmdOut[cmdOut.index('=')+2:]
                cmdOut = cmdOut[:cmdOut.index('\'')]
                file.write(str(hexOut) + " | " + str(cmdOut) + "\n")
                if len(cmdOut) != 40:
                    print cmdOut
                if cmdOut == sha:
                    print "Found bit reversals in " + str(tries) + " tries. Corrected key:"
                    print hexOut
                    sys.exit()
                b2 = binary
                tries = tries + 1
                if tries % 10000 == 0:
                    print tries
EDIT:
Changing the for loops to
for x in range(-2, 128):
    for y in range(x+1,128):
        for z in range(y+1,128):
drastically cuts down on the number of guesses while (I think?) still covering the whole space. I'm still getting some duplicates and still having no luck finding the match, though.
Your code, if not very efficient, looks fine except for one thing:
hexOut = hex(int(b2,2))
as the output of hex
>>> hex(int('01110110000101',2))
'0x1d85'
starts with '0x', which shouldn't be part of the key. So, you should be fine after removing these two characters.
For the number of possible keys to try, you have:
1 with no bit flipped
128 with 1 bit flipped
128*127/2 = 8128 with 2 bits flipped (128 ways to choose the first one, 127 ways to choose the second, and each pair will appear twice)
128*127*126/6 = 341376 with 3 bits flipped (each triplet appears 6 times). This is the number of combinations of 128 bits taken 3 at a time.
So, the total is 1 + 128 + 8128 + 341376 = 349633 possibilities.
Your code tests each of them many times. You could avoid the useless repetitions by looping like this (for 3 bits):
for x in range(0, 128):
    for y in range(x+1, 128):
        for z in range(y+1, 128):
            .....
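You could adapt your trick of starting at -2, although this variant still tests every two-bit flip twice (once with x = -2 and once with x = -1) and never tests the unmodified key:
for x in range(-2, 128):
    for y in range(x+1, 128):
        for z in range(y+1, 128):
            .... same code you used ...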
You could also generate the combinations with itertools.combinations:
from itertools import combinations
for x, y, z in combinations(range(128), 3): # for 3 bits
    ......
but you'd need a bit more work to manage the cases with 0, 1, 2 and 3 flipped bits in this case.
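One way to handle all four cases at once is to loop over the number of flipped bits as well. The sketch below is written for Python 3 rather than the Python 2 of the question, and the names flipped_keys and recover_key are made up for this example. It also assumes, like the question's script, that the SHA-1 is taken over the hex text of the key; if the hash was computed over the raw key bytes, the encoding step would have to change. The demo uses a synthetic key, not the one from the assignment.

```python
import hashlib
from itertools import combinations
from math import comb

def flipped_keys(key, nbits=128, max_flips=3):
    """Yield `key` with every choice of 0..max_flips bits inverted."""
    for count in range(max_flips + 1):
        for positions in combinations(range(nbits), count):
            candidate = key
            for pos in positions:
                candidate ^= 1 << pos
            yield candidate

def recover_key(corrupt_hex, target_sha1, max_flips=3):
    """Return the matching key as hex text, or None if nothing matches."""
    nbits = 4 * len(corrupt_hex)
    for candidate in flipped_keys(int(corrupt_hex, 16), nbits, max_flips):
        # Assumption: the hash was taken over the lowercase hex text.
        text = format(candidate, "0%dx" % len(corrupt_hex))
        if hashlib.sha1(text.encode("ascii")).hexdigest() == target_sha1:
            return text
    return None

# Number of candidates for a 128-bit key with up to 3 flipped bits.
total = sum(comb(128, k) for k in range(4))
print(total)  # 349633

# Synthetic demo: corrupt two bits of a made-up key, then recover it.
original = "00112233445566778899aabbccddeeff"
target = hashlib.sha1(original.encode("ascii")).hexdigest()
corrupt = format(int(original, 16) ^ (1 << 5) ^ (1 << 100), "032x")
print(recover_key(corrupt, target) == original)
```

Formatting each candidate with a fixed width keeps leading zeros and avoids the '0x' prefix that hex() adds.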

Telegram API files uploading

Telegram documentation says the following about files ID:
The file’s binary content is then split into parts. All parts must
have the same size (part_size) and the following conditions must be
met:
part_size % 1024 = 0 (divisible by 1KB)
524288 % part_size = 0 (512KB must be evenly divisible by part_size)
The last part does not have to satisfy these conditions, provided its
size is less than part_size. Each part should have a sequence number,
file_part, with a value ranging from 0 to 2,999.
My code:
def check_conditions(file_name):
    b = False
    file_binary_data = open("D:\\" + file_name, "br").read()
    length = len(bytearray(file_binary_data))
    print(file_name + ", size: " + str(length) + " bytes")
    for i in range(1, 3000):
        part = length // i
        if part % 1024 == 0 and 524288 % part == 0:
            print("i: " + str(i) + " | part size: " + str(part))
            b = True
    if not b:
        print("No mathces")
    print()
check_conditions("The White Stripes - Truth Doesn't Make A Noise.mp3")
check_conditions("Depeche Mode - Precious.mp3")
check_conditions("Placebo - Meds.mp3")
Output:
The White Stripes - Truth Doesn't Make A Noise.mp3, size: 7782220 bytes
No mathces
Depeche Mode - Precious.mp3, size: 10298248 bytes
i: 1257 | part size: 8192
i: 2514 | part size: 4096
Placebo - Meds.mp3, size: 11808625 bytes
No mathces
Where is the mistake? Or, if everything is OK, what should I do with files that don't meet the conditions?
You are misreading the requirement.
You don't have to find a part size that divides the file exactly. You simply split the file into pieces of equal size, and only the last piece may be smaller.
"The White Stripes - Truth Doesn't Make A Noise.mp3", size: 7782220 bytes
Say you are using the maximum piece size of 512 KB (i.e. 524288 bytes). Then you simply have:
7782220 / 524288 = 14 rem 442188
Hence you have 14 pieces of 524288 bytes and a last piece of 442188 bytes.
Apply the same logic to the other files.
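The same arithmetic can be written as a small helper. This is a sketch only: plan_parts and iter_parts are hypothetical names, and nothing here talks to the Telegram API. It just checks the two documented conditions on part_size and splits a byte string accordingly.

```python
MAX_PART_SIZE = 524288  # 512 KB

def plan_parts(file_size, part_size=MAX_PART_SIZE):
    """Return (number of full parts, size of the final shorter part)."""
    if part_size % 1024 != 0 or MAX_PART_SIZE % part_size != 0:
        raise ValueError("part_size violates the documented conditions")
    full_parts, last_part = divmod(file_size, part_size)
    return full_parts, last_part

def iter_parts(data, part_size=MAX_PART_SIZE):
    """Yield (file_part, chunk) pairs; only the last chunk may be shorter."""
    for file_part, start in enumerate(range(0, len(data), part_size)):
        yield file_part, data[start:start + part_size]

# The three file sizes from the question.
for size in (7782220, 10298248, 11808625):
    print(size, plan_parts(size))
```

For the first file this gives 14 full parts plus a final part of 442188 bytes, as in the calculation above.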

Problems with Smoothing graphs in Python

I have been trying to smooth a plot which is noisy due to the sampling rate I'm using and what it's counting. I've been using the help on here, mainly Plot smooth line with PyPlot (although I couldn't find the "spline" function and so am using UnivariateSpline instead).
However, whatever I do I keep getting errors: either the pyplot error that "x and y are not of the same length", or that scipy's UnivariateSpline has a value for w that is incorrect. I am not sure quite how to fix this (not really a Python person!). I've attached the code, although it's just the plotting bit at the end that is causing problems. Thanks
import os.path
import matplotlib.pyplot as plt
import scipy.interpolate as sci
import numpy as np
def main():
    jcc = "0050"
    dj = "005"
    l = "060"
    D = 20
    hT = 4 * D
    wT1 = 2 * D
    wT2 = 5 * D
    for jcm in ["025","030","035","040","045","050","055","060"]:
        characteristic = "LeadersOnly/Jcm" + jcm + "/Jcc" + jcc + "/dJ" + dj + "/lambda" + l + "/Seed000"
        fingertime1 = []
        fingertime2 = []
        stamp =[]
        finger=[]
        for x in range(0,2500,50):
            if x<10000:
                z=("00"+str(x))
            if x<1000:
                z=("000"+str(x))
            if x<100:
                z=("0000"+str(x))
            if x<10:
                z=("00000"+str(x))
            stamp.append(x)
            path = "LeadersOnly/Jcm" + jcm + "/Jcc" + jcc + "/dJ" + dj + "/lambda" + l + "/Seed000/profile_" + str(z) + ".txt"
            if os.path.exists(path):
                f = open(path, 'r')
                pr1,pr2=np.genfromtxt(path, delimiter='\t', unpack=True)
                p1=[]
                p2=[]
                h1=[]
                h2=[]
                a1=[]
                a2=[]
                finger1 = 0
                finger2 = 0
                for b in range(len(pr1)):
                    p1.append(pr1[b])
                    p2.append(pr2[b])
                for elem in range(len(pr1)-80):
                    h1.append((p1[elem + (2*D)]-0.5*(p1[elem]+p1[elem + (4*D)])))
                    h2.append((p2[elem + (2*D)]-0.5*(p2[elem]+p2[elem + (4*D)])))
                    if h1[elem] >= hT:
                        a1.append(1)
                    else:
                        a1.append(0)
                    if h2[elem]>=hT:
                        a2.append(1)
                    else:
                        a2.append(0)
                for elem in range(len(a1)-1):
                    if (a1[elem] - a1[elem + 1]) != 0:
                        finger1 = finger1 + 1
                finger1 = finger1 / 2
                for elem in range(len(a2)-1):
                    if (a2[elem] - a2[elem + 1]) != 0:
                        finger2 = finger2 + 1
                finger2 = finger2 / 2
                fingertime1.append(finger1)
                fingertime2.append(finger2)
                finger.append((finger1+finger2)/2)
        namegraph = jcm
        stampnew = np.linspace(stamp[0],stamp[-1],300)
        fingernew = sci.UnivariateSpline(stamp, finger, stampnew)
        plt.plot(stampnew,fingernew,label=namegraph)
    plt.show()
main()
For information, the data input files are simply lists of integers (two lists separated by tabs, as the code suggests).
Here is one of the error codes that I get:
0-th dimension must be fixed to 50 but got 300
error Traceback (most recent call last)
/group/data/Cara/JCMMOTFingers/fingercount_jcm_smooth.py in <module>()
116
117 if __name__ == '__main__':
--> 118 main()
119
120
/group/data/Cara/JCMMOTFingers/fingercount_jcm_smooth.py in main()
93 #print(len(stamp))
94 stampnew = np.linspace(stamp[0],stamp[-1],300)
---> 95 fingernew = sci.UnivariateSpline(stamp, finger, stampnew)
96 #print(len(stampnew))
97 #print(len(fingernew))
/usr/lib/python2.6/dist-packages/scipy/interpolate/fitpack2.pyc in __init__(self, x, y, w, bbox, k, s)
86 #_data == x,y,w,xb,xe,k,s,n,t,c,fp,fpint,nrdata,ier
87 data = dfitpack.fpcurf0(x,y,k,w=w,
---> 88 xb=bbox[0],xe=bbox[1],s=s)
89 if data[-1]==1:
90 # nest too small, setting to maximum bound
error: failed in converting 1st keyword `w' of dfitpack.fpcurf0 to C/Fortran array
Let's analyze your code a bit, starting from the for x in range(0, 2500, 50):
You define z as a string of 6 digits padded with 0s. You should really use some string formatting like z = "{0:06d}".format(x) or z = "%06d" % x instead of these multiple tests of yours.
At the end of your loop, stamp will have (2500//50)=50 elements.
You check for the existence of your file path, then open it and read it, but you never close it. A more Pythonic way is to do:
try:
    with open(path,"r") as f:
        ...  # read and process the file here
except IOError:
    ...  # do something else here
With the with syntax, your file is automatically closed.
pr1 and pr2 are likely to be 1D arrays, right? You can really simplify the construction of your p1 and p2 lists as:
p1 = pr1.tolist()
p2 = pr2.tolist()
Your lists a1, a2 have the same size: you could combine your for elem in range(len(a..)-1) loops in a single one. You could also use the np.diff function.
At the end of the for x in range(...) loop, finger will have 50 elements minus the number of missing files. As you don't say what should happen when a file is missing, your stamp and finger lists may not have the same number of elements, which will crash UnivariateSpline. An easy fix would be to update your stamp list only if the file at path exists (that way, it always has the same number of elements as finger).
Your stampnew array has 300 elements, while stamp and finger have at most 50. That's a second problem: the third positional argument of UnivariateSpline is the weight array w, so stampnew is being interpreted as weights, and the weights must have the same size as the inputs.
You're eventually trying to plot fingernew against stampnew. The problem is that fingernew is not an array; it's an instance of UnivariateSpline. You still need to calculate some actual points, for example with fingernew(stampnew), then use that in your plot function.
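Some of the bookkeeping points above can be illustrated without any files or plotting. The sketch below is written for Python 3, and all three helper names (profile_name, count_fingers, collect) are hypothetical. It shows the zero-padded file name, a compact way to count transitions in a 0/1 list, and a loop that appends to stamp only when data exists, so that stamp and finger always stay the same length. A missing file is represented by None in the demo input.

```python
def profile_name(x):
    """Zero-padded file name, e.g. profile_000050.txt for x = 50."""
    return "profile_{0:06d}.txt".format(x)

def count_fingers(flags):
    """Half the number of 0/1 transitions between neighbouring entries."""
    transitions = sum(1 for a, b in zip(flags, flags[1:]) if a != b)
    return transitions / 2

def collect(samples):
    """Build stamp and finger together so they always have equal length.

    `samples` is an iterable of (x, flags) pairs; flags is None when the
    profile file for that x is missing.
    """
    stamp, finger = [], []
    for x, flags in samples:
        if flags is None:
            continue  # skip the time stamp as well as the data
        stamp.append(x)
        finger.append(count_fingers(flags))
    return stamp, finger

samples = [(0, [0, 1, 1, 0]), (50, None), (100, [0, 1, 0, 1, 0])]
stamp, finger = collect(samples)
print(profile_name(50))
print(stamp, finger)
```

Note that in Python 2 the division by 2 in the original code is an integer division, whereas this sketch returns a float.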

All Possible combination for an HEX Value from a given set of chars

I am new to Python and programming.
I am looking for code, or sample code, that takes a predefined set of hex values
and finds the 3 values from that set which combine to produce a certain value.
Let's say I have the value 0x50158A51.
This is a 4-byte (32-bit) hex value.
Now I need to find the values (from the provided set) which, when added or subtracted, end with this result.
for example:
0x75612171 + 0x75612171 + 0x6553476F = 0x50158A51
notice that the values added are all from the allowed set
Just to be clear i have a limited chars set
which is :
\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13
\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26
\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39
\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c
\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f
\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72
\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85
\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98
\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab
\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe
\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1
\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4
\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7
\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff
i used a simple code to calculate 3 values:
#!/usr/bin/python
hex1 = 0x55555521
hex2 = 0x55555421
hex3 = 0x6D556F49
calc = hex1 + hex2 + hex3
print hex(calc)
which will give a result of:
root#linux:~# ./calc2.py
0x150158a51
I need to somehow reverse this process by placing variations from my allowed set into the variables.
for example:
placing 4 byte hex values from the set into the variables
try:
hex1 = placing 4bytes from allowed set
hex2 = placing 4bytes from allowed set
hex3 = placing 4bytes from allowed set
if result (hex1+hex2+hex3) = 0x150158a51
then
print "values used for this results are: hex1 hex2 hex3"
Thank you in advance.
What you're asking for isn't possible. There will be infinite sequences of numbers that when added together will continue to produce the same result, modulo 2^32.
As a trivial example, say that your target number is 0x10000000 and the only hex values you allow are zero and one. Then the following sequences of numbers will result in 0x10000000:
0x1 + 0x1 + ... + 0x1 (0x10000000 times) = 0x10000000
0x1 + 0x1 + ... + 0x1 (0x110000000 times) = 0x10000000
0x1 + 0x1 + ... + 0x1 (0x210000000 times) = 0x10000000
and so on. Since you can continue adding 0x1's indefinitely, the algorithm can never terminate.
The following program for 0x50158A51 generates:
0x50157f51 + 0x00000b00 + 0x00000000 = 0x50158A51
for 0x1090F0FF it generates:
0x107f7f7f + 0x0011717f + 0x00000001 = 0x1090f0ff
where all "characters" in summands are from allowed set and not from disallowed set.
The program:
a=0x1090F0FF
a0=0
a1=0
a2=0
# Process the target one byte at a time, most significant byte first
for n in range(3,-1,-1):
    a0<<=8
    a1<<=8
    a2<<=8
    mask = 0xff<<(n*8)
    b=(a&mask)>>(n*8)
    if b > 2*0x7f:
        a0 += 0x7f
        a1 += 0x7f
        a2 += b - 2*0x7f
    elif b > 0x7f:
        a0 += 0x7f
        a1 += b - 0x7f
    else:
        a0 += b
print('%08x + %08x + %08x = %08x' % (a0, a1, a2, a0+a1+a2))
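Any proposed triple can be checked mechanically. The sketch below uses hypothetical helper names and makes two assumptions taken from the question: the allowed bytes are 0x01 to 0xff as listed there, and the sum is compared modulo 2^32 (the question's own example sums to 0x150158a51 for a target of 0x50158A51).

```python
# Assumption: the allowed set is every byte value from 0x01 to 0xff.
ALLOWED = set(range(0x01, 0x100))

def uses_allowed_bytes(value, allowed=ALLOWED):
    """True if each of the 4 bytes of a 32-bit value is in the allowed set."""
    return all(((value >> shift) & 0xFF) in allowed for shift in (24, 16, 8, 0))

def is_valid_solution(summands, target, allowed=ALLOWED):
    """True if the summands add up to target modulo 2^32 and use only allowed bytes."""
    total = sum(summands) & 0xFFFFFFFF
    return total == target and all(uses_allowed_bytes(s, allowed) for s in summands)

# The example triple from the question.
print(is_valid_solution([0x75612171, 0x75612171, 0x6553476F], 0x50158A51))
# A triple whose summands contain 0x00 padding bytes.
print(is_valid_solution([0x50157F51, 0x00000B00, 0x00000000], 0x50158A51))
```

Under these assumptions the first triple passes. The second one adds up correctly but is rejected, because 0x00 bytes are not in the set listed in the question.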
From what I understood, but I may be wrong, you are talking about variation of a Subset Sum Problem, which is NP-Complete. So you may look for some more info about that.
