I am creating a socket server to connect to and speak with a C# program over TCP. Currently I am trying to create a way to convert the hex sent over the TCP socket into specific variables (the variable type will be in the packet header, and yes, I do know TCP is a stream and doesn't technically send packets, but I am designing it this way). I have all of the C# integral numeric types converting to and from bytearrays/integers correctly via the code below (the classes for the other types are the same apart from a couple of edits to fit each C# type):
## SBYTE Type Class definition
## C#/Unity "sbyte" can be from -128 to 127
##
## Usage:
##
## Constructor
## variable = sbyte(integer)
## variable = sbyte(bytearray)
##
## Variables
## sbyte.integer (Returns integer representation)
## sbyte.bytes (Returns bytearray representation)
class sbyte:
    def __init__(self, input):
        if type(input) == type(int()):
            self.integer = input
            self.bytes = self.__toBytes(input)
        elif type(input) == type(bytearray()):
            self.bytes = input
            self.integer = self.__toInt(input)
        else:
            raise TypeError(f"sbyte constructor can take integer or bytearray type not {type(input)}")

    ## Return Integer from Bytes Array
    def __toInt(self, byteArray):
        ## Check that there is only 1 byte
        if len(byteArray) != 1:
            raise OverflowError(f"sbyte.__toInt length can only be 1 byte not {len(byteArray)} bytes")
        ## Return signed integer
        return int.from_bytes(byteArray, byteorder='little', signed=True)

    ## Return Bytes Array from Integer
    def __toBytes(self, integer):
        ## Check that the passed integer is not larger than 127 and is not smaller than -128
        if integer > 127 or integer < -128:
            raise ValueError(f"sbyte.__toBytes can only take an integer less than or equal to 127, and greater than or equal to -128, not \"{integer}\"")
        ## Convert the passed integer to Bytes
        return integer.to_bytes(1, byteorder='little', signed=True)
This is working for all the types I have currently implemented, but I do wonder if there is a better way to handle this, such as using ctypes or some other Python library. Since this will be a socket server with potentially many connections, handling this as fast as possible is best. If there is anything else you see that I can improve, I would love to know.
If all you want is an integer value from a byte array, simply index the byte array:
>>> b = bytearray.fromhex('1E')
>>> b[0]
30
After testing the differences between from_bytes, struct.unpack, and numpy.frombuffer with the following code:
import timeit

setup1 = """\
byteArray = bytearray.fromhex('1E')
"""
setup2 = """\
import struct
byteArray = bytearray.fromhex('1E')
"""
setup3 = """\
import numpy as np
dt = np.dtype(np.byte)
byteArray = bytearray.fromhex('1E')
"""
stmt1 = "int.from_bytes(byteArray, byteorder='little', signed=True)"
stmt2 = "struct.unpack('b', byteArray)"
stmt3 = "np.frombuffer(byteArray, dt)"
print(f"First statement execution time = {timeit.timeit(stmt=stmt1, setup=setup1, number=10**8)}")
print(f"Second statement execution time = {timeit.timeit(stmt=stmt2, setup=setup2, number=10**8)}")
print(f"Third statement execution time = {timeit.timeit(stmt=stmt3, setup=setup3, number=10**8)}")
Results:
First statement execution time = 14.456886599999999
Second statement execution time = 6.671141799999999
Third statement execution time = 21.8327342
From the initial results it looks like struct is the fastest way to accomplish this, unless there are other libraries I am missing.
EDIT:
Per AKX's suggestion I added the following test for signed byte:
stmt4 = """\
if byteArray[0] <= 127:
    byteArray[0]
else:
    byteArray[0] - 256
"""
and got the following execution time:
Fourth statement execution time = 4.581732600000002
Going this path is the fastest, although only slightly faster than just using struct. I will have to test each type to find the fastest way to convert the bytes and vice versa, but this question gave me four different ways to test each one now. Thanks!
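One further variant that may be worth adding to the benchmark (not measured here): precompiling the format with struct.Struct, so the format string is parsed once up front and each call goes straight to the compiled unpacker:

setup5 = """\
import struct
unpacker = struct.Struct('b')
byteArray = bytearray.fromhex('1E')
"""
stmt5 = "unpacker.unpack(byteArray)"

The same precompiled object also gives you unpacker.pack(value) for the other direction, so one Struct per C# type could serve both conversions.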
First off, I'm trying to do something like screen sharing, and I'm running into an error. I tried to send the image row by row, then figured out that it could only work on a LAN, where the connection speed was far better. Then I transmitted each pixel individually and constructed the array on my own, which is pretty slow. Here is the code:
Server side:
def getImg(socket,y,x):
    pixels = []
    print(x*y*3)
    for a in range(x*y*3):
        pixels.append(struct.unpack('i', socket.recv(4))[0])
        if a%(x*y)==0:
            pass
    return np.array(pixels).reshape(y,x,3)
Client side:
def sendSS(img):
    y, x, color = img.shape
    print(y*x*3)
    for a in range(y):
        for b in range(x):
            for m in img[a][b]:
                socket.send(struct.pack('i', m))
The error is this:
(<class 'cv2.error'>, error("OpenCV(4.5.1) c:\\users\\appveyor\\appdata\\local\\temp\\1\\pip-req-build-wvn_it83\\opencv\\modules\\imgproc\\src\\color.simd_helpers.hpp:94: error: (-2:Unspecified error) in function '__cdecl cv::impl::`anonymous-namespace'::CvtHelper<struct cv::impl::`anonymous namespace'::Set<3,4,-1>,struct cv::impl::A0xeee51b91::Set<3,4,-1>,struct cv::impl::A0xeee51b91::Set<0,2,5>,2>::CvtHelper(const class cv::_InputArray &,const class cv::_OutputArray &,int)'\n> Unsupported depth of input image:\n> 'VDepth::contains(depth)'\n> where\n> 'depth' is 4 (CV_32S)\n"), <traceback object at 0x0000023221307740>)
The way I constructed the numpy array works fine:
import pyautogui
import cv2
import numpy as np
import pickle
import struct
from pprint import pprint as pp
ss = np.array(pyautogui.screenshot())
y,x,color = ss.shape
pixels = []
for a in range(y):
    for b in range(x):
        for m in ss[a][b]:
            pixels.append(struct.pack('i',m))
pixels2 = []
for val in pixels:
    pixels2.append(struct.unpack('i',val)[0])
pixels2 = np.array(pixels2).reshape((1080,1920,3))
cv2.imwrite('name.png',pixels2)
How can I transmit faster and solve this problem?
edit:
Here is the part of the code where the error occurs. It says the depth is 4, but I'm sure the dimensions of the image are (1080, 1920, 3) because I can print them.
image = getImg(aSocket,y,x)
SS = cv2.cvtColor(image,cv2.COLOR_RGB2BGR) # error occurs in this line
aSocket.send('image is gotten'.encode('utf-8'))
print('image is gotten')
cv2.imwrite(f'{ip[0]}\{count}.png',SS)
You can pass the new shape to reshape as a tuple, e.g. arr.reshape((y,x,3)) (passing separate arguments works as well). The error itself comes from OpenCV expecting a different data type (8 bits per channel, unsigned) than what it got from your program (32-bit signed integer); see the documentation for cvtColor(). You can specify the data type when you create the array: np.array(..., dtype=np.uint8).
Regarding performance: in Python, for loops over large amounts of data can be slow, and you can often make your code significantly faster by avoiding the loop and using a library routine instead. Also, you should read and write larger chunks with send() and recv(). If you are not familiar with socket programming, you should at minimum check out the Socket Programming HOWTO in the Python documentation. You can eliminate the loops over the image data by interpreting the data directly as bytes; the socket code below follows the examples in the linked documentation page.
def send_im(sock, im):
    totalsent = 0
    im_bytes = im.tobytes()
    while totalsent < len(im_bytes):
        sent = sock.send(im_bytes[totalsent:])
        if sent == 0:
            raise RuntimeError("socket error")
        totalsent += sent
Receiver needs to know the size of expected input.
import numpy as np

def recv_im(sock, dtype, shape):
    # expected number of bytes
    size = int(np.dtype(dtype).itemsize * np.prod(shape))
    im_bytes = bytearray(size)  # buffer for the received bytes
    bytes_recvd = 0
    chunk_size = 2048
    while bytes_recvd < size:
        chunk = sock.recv(min(size - bytes_recvd, chunk_size))
        if chunk == b'':
            raise RuntimeError("socket connection broken")
        im_bytes[bytes_recvd:bytes_recvd + len(chunk)] = memoryview(chunk)
        bytes_recvd += len(chunk)
    # interpret the bytes with the correct data type and shape and return the image
    return np.frombuffer(im_bytes, dtype=dtype).reshape(shape)
In general, you should always use a serialization library instead of directly sending raw bytes over the network, as those libraries usually handle various important details such as byte order, message sizes, data types, conversions, and so on.
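For example, here is a small sketch of that idea using numpy's own .npy serialization (np.save / np.load), so that dtype, shape, and byte order travel with the data. The helper names are mine, not from the code above; since the payload length now varies, you would also prefix it with its length (e.g. struct.pack('!Q', len(payload))) before sending it with the send/recv pattern shown above.

import io
import numpy as np

def im_to_bytes(im):
    # Serialize the array to the .npy format in memory; the header records
    # dtype and shape, so the receiver does not need to know them in advance.
    buf = io.BytesIO()
    np.save(buf, im)
    return buf.getvalue()

def im_from_bytes(payload):
    # Reconstruct the array, including dtype and shape, from the raw bytes.
    return np.load(io.BytesIO(payload))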
I am working on a server engine in Python, for my game made in GameMaker Studio 2. I'm currently having some issues with making and sending a packet.
I've successfully managed to establish a connection and send the first packet, but I can't find a solution for handling incoming data where, if the first value in the packed struct equals a particular message ID, the rest of the data should be unpacked with the corresponding format.
Example:
from struct import pack

types = 'hhh'  # (message_id, x, y) example
message_id = 0
x = 200
y = 200
buffer = pack(types, message_id, x, y)
On the server side:
data = conn.recv(BUFFER_SIZE)
mid = unpack('h', data)[0]
if not data: break
if mid == 0:
    sequence = 'hhh'
    x = unpack(sequence, data)[1]
    y = unpack(sequence, data)[2]
It looks like your subsequent decoding is going to vary based on the message ID?
If so, you will likely want to use unpack_from which allows you to pull only the first member from the data (as written now, your initial unpack call will generate an exception because the buffer you're handing it is not the right size). You can then have code that varies the unpacking format string based on the message ID. That code could look something like this:
from struct import pack, unpack, unpack_from
while True:
    data = conn.recv(BUFFER_SIZE)
    # End of file, bail out of loop
    if not data: break
    mid = unpack_from('!h', data)[0]
    if mid == 0:
        # Message ID 0
        types = '!hhh'
        _, x, y = unpack(types, data)
        # Process message type 0
        ...
    elif mid == 1:
        types = '!hIIq'
        _, v, w, z = unpack(types, data)
        # Process message type 1
        ...
    elif mid == 2:
        ...
Note that we're unpacking the message ID again in each case along with the ID-specific parameters. You could avoid that if you like by using the optional offset argument to unpack_from:
x, y = unpack_from('!hh', data, offset=2)
One other note of explanation: If you are sending messages between two different machines, you should consider the "endianness" (byte ordering). Not all machines are "little-endian" like x86. Accordingly it's conventional to send integers and other structured numerics in a certain defined byte order - traditionally that has been "network byte order" (which is big-endian) but either is okay as long as you're consistent. You can easily do that by prepending each format string with '!' or '<' as shown above (you'll need to do that for every format string on both sides).
Finally, the above code probably works fine for a simple "toy" application but as your program increases in scope and complexity, you should be aware that there is no guarantee that your single recv call actually receives all the bytes that were sent and no other bytes (such as bytes from a subsequently sent buffer). In other words, it's often necessary to add a buffering layer, or otherwise ensure that you have received and are operating on exactly the number of bytes you intended.
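A minimal sketch of one such layer, assuming every message is a fixed-size struct whose format begins with the '!h' message ID, as in the example above (the names FORMATS, recv_exact, and recv_message are mine, not part of the code above):

import struct

FORMATS = {0: '!hhh', 1: '!hIIq'}   # message ID -> full message format, as in the example above

def recv_exact(conn, n):
    # Keep calling recv() until exactly n bytes have arrived, or fail loudly.
    buf = b''
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf += chunk
    return buf

def recv_message(conn):
    # Read just the 2-byte message ID first, then exactly the rest of that message.
    mid, = struct.unpack('!h', recv_exact(conn, 2))
    fmt = FORMATS[mid]
    body = recv_exact(conn, struct.calcsize(fmt) - 2)
    return mid, struct.unpack('!' + fmt[2:], body)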
Could you unpack the whole data into a list and then check its elements in a loop? What is the reason to unpack it three times? I guess you could unpack it once and then work with that list: check its length first, and if it's not empty, check the first element; if it equals the special value, continue parsing the list. Did you try it like that?
I have been using Python for my assignments for the past few days. I noticed one strange thing:
When I convert a string to a float, it gives exactly the same number of digits that were in the string.
When I write this number to a file using struct.pack() with 4-byte floats and read it back using struct.unpack(), it gives a number that is not exactly the same but some longer value, which I would expect given how floating point numbers are stored.
Example:
String: 0.931973
Float num: 0.931973
From file: 0.931972980499 (after struct pack and unpack into 4 bytes)
So I am unable to understand how Python actually stored my number when it was first read from the string.
EDIT
Writing the float (this is Python 2.7 on Ubuntu; here 'd' is double and 'f' is float):
buf = struct.pack("f", float(self.dataArray[i]))
fout.write(buf)
Query -
buf = struct.pack("f", dataPoint)
dataPoint = struct.unpack("f", buf)[0]
node = root
while node.isBPlusNodeLeaf() == False:
    node = node.findNextNode(dataPoint)
findNextNode -
def findNextNode(self, num):
    i = 0
    for d in self.dataArray:
        if float(num) > float(d):
            i = i + 1
            continue
        else:
            break
    ptr = self.pointerArray[i]
    # open the node before passing on the pointer to it
    out, tptr = self.isNodeAlive(ptr)
    if out == False:
        node = BPlusNode(name = ptr)
        node.readBPlusNode(ptr)
        return node
    else:
        return BPlusNode.allNodes[tptr]
Once I reach the leaf, it reads the leaf and checks whether the data point exists there:
for data in node.dataArray:
    if data == dataPoint:
        return True
return False
So in this case it reports an unsuccessful search for the data point 0.931972980499, even though it is there.
While the following code works fine:
for data in node.dataArray:
    if round(float(data), 6) == dataPoint:
        return True
return False
I am not able to understand why this is happening
float in Python is actually what C programmers call double, i.e. it is 64 bits (or perhaps even wider on some platforms). So when you store it in 4 bytes (32 bits), you lose precision.
If you use the d format instead of f, you should see the results you expect:
>>> struct.unpack('d', struct.pack('d', float('0.931973')))
(0.931973,)
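For contrast, squeezing the same value through the 4-byte 'f' format reproduces the longer number you saw when reading from the file; a rough sketch of that round trip:

import struct

x = float('0.931973')                             # Python float, i.e. a 64-bit double
y = struct.unpack('f', struct.pack('f', x))[0]    # through 32 bits and back
print(x == y)              # False: y is the nearest 32-bit float, roughly 0.931972980499
print(round(y, 6) == x)    # True, which is why the round(..., 6) comparison above works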
In Java, I can encode a BigInteger as:
java.math.BigInteger bi = new java.math.BigInteger("65537L");
String encoded = Base64.encodeBytes(bi.toByteArray(), Base64.ENCODE|Base64.DONT_GUNZIP);
// result: 65537L encodes as "AQAB" in Base64
byte[] decoded = Base64.decode(encoded, Base64.DECODE|Base64.DONT_GUNZIP);
java.math.BigInteger back = new java.math.BigInteger(decoded);
In C#:
System.Numerics.BigInteger bi = new System.Numerics.BigInteger("65537L");
string encoded = Convert.ToBase64(bi);
byte[] decoded = Convert.FromBase64String(encoded);
System.Numerics.BigInteger back = new System.Numerics.BigInteger(decoded);
How can I encode long integers in Python as Base64-encoded strings? What I've tried so far produces results different from the implementations in other languages (so far I've tried Java and C#); in particular, it produces longer Base64-encoded strings.
import struct
encoded = struct.pack('I', (1<<16)+1).encode('base64')[:-1]
# produces a longer string, 'AQABAA==' instead of the expected 'AQAB'
When using this Python code to produce a Base64-encoded string and then decoding it in Java (for example), I get 16777472 instead of the expected 65537. Firstly, what am I missing?
Secondly, I have to figure out by hand which length format to use in struct.pack, and if I'm trying to encode a long number (greater than (1<<64)-1) the 'Q' format specification is too short to hold the representation. Does that mean I have to build the representation by hand, or is there an undocumented format specifier for the struct.pack function? (I'm not compelled to use struct, but at first glance it seemed to do what I needed.)
Check out this page on converting integer to base64.
import base64
import struct
def encode(n):
    data = struct.pack('<Q', n).rstrip('\x00')
    if len(data) == 0:
        data = '\x00'
    s = base64.urlsafe_b64encode(data).rstrip('=')
    return s

def decode(s):
    data = base64.urlsafe_b64decode(s + '==')
    n = struct.unpack('<Q', data + '\x00' * (8 - len(data)))
    return n[0]
The struct module:
… performs conversions between Python values and C structs represented as Python strings.
Because C doesn't have infinite-length integers, there's no functionality for packing them.
But it's very easy to write yourself. For example:
def pack_bigint(i):
    b = bytearray()
    while i:
        b.append(i & 0xFF)
        i >>= 8
    return b
Or:
def pack_bigint(i):
    bl = (i.bit_length() + 7) // 8
    fmt = '<{}B'.format(bl)
    # ...
And so on.
And of course you'll want an unpack function, like jbatista's from the comments:
def unpack_bigint(b):
    b = bytearray(b)  # in case you're passing in a bytes/str
    return sum((1 << (bi*8)) * bb for (bi, bb) in enumerate(b))
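On Python 3 you can skip the manual loop entirely, since int has to_bytes and from_bytes built in; a quick sketch (my own helper names) that reproduces the 'AQAB' from the Java example in the question:

import base64

def encode(n):
    data = n.to_bytes(max(1, (n.bit_length() + 7) // 8), 'big')   # minimal big-endian bytes
    return base64.urlsafe_b64encode(data).rstrip(b'=')

def decode(s):
    data = base64.urlsafe_b64decode(s + b'=' * (-len(s) % 4))     # restore the stripped padding
    return int.from_bytes(data, 'big')

print(encode(65537))    # b'AQAB'
print(decode(b'AQAB'))  # 65537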
This is a bit late, but I figured I'd throw my hat in the ring:
import base64
import struct

def inttob64(n):
    """
    Given an integer returns the base64 encoded version of it (no trailing ==)
    """
    parts = []
    while n:
        parts.insert(0, n & 0xffffffff)
        n >>= 32
    data = struct.pack('>' + 'L' * len(parts), *parts)
    s = base64.urlsafe_b64encode(data).rstrip('=')
    return s
def b64toint(s):
    """
    Given a string with a base64 encoded value, return the integer representation
    of it
    """
    data = base64.urlsafe_b64decode(s + '==')
    n = 0
    while data:
        n <<= 32
        (toor,) = struct.unpack('>L', data[:4])
        n |= toor & 0xffffffff
        data = data[4:]
    return n
These functions turn an arbitrary-sized long number to/from a big-endian base64 representation.
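Example use (Python 2, like the functions above). Because the value is padded out to whole 32-bit words, the result differs from a minimal-byte encoding: 65537 comes out as 'AAEAAQ' here rather than the 'AQAB' of the Java example in the question.

print inttob64(65537)      # AAEAAQ
print b64toint('AAEAAQ')   # 65537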
Here is something that may help. Instead of using struct.pack(), I am building a string of bytes to encode and then calling the Base64 encoder on that. I didn't write the decode, but clearly the decoder can recover an identical string of bytes, and a loop could recover the original value. I don't know if you need fixed-size integers (like always 128-bit), and I don't know if you need big-endian, so I left the decoder to you.
Also, encode64() and decode64() are from @msc's answer, but modified to work.
import base64
import struct
def encode64(n):
    data = struct.pack('<Q', n).rstrip('\x00')
    if len(data) == 0:
        data = '\x00'
    s = base64.urlsafe_b64encode(data).rstrip('=')
    return s

def decode64(s):
    data = base64.urlsafe_b64decode(s + '==')
    n = struct.unpack('<Q', data + '\x00' * (8 - len(data)))
    return n[0]
def encode(n, big_endian=False):
    lst = []
    while True:
        n, lsb = divmod(n, 0x100)
        lst.append(chr(lsb))
        if not n:
            break
    if big_endian:
        # I have not tested Big Endian mode, and it may need to have
        # some initial zero bytes prepended; like, if the integer is
        # supposed to be a 128-bit integer, and you encode a 1, you
        # would need this to have 15 leading zero bytes.
        initial_zero_bytes = '\x00' * 2
        data = initial_zero_bytes + ''.join(reversed(lst))
    else:
        data = ''.join(lst)
    s = base64.urlsafe_b64encode(data).rstrip('=')
    return s

print encode(1234567890098765432112345678900987654321)
I need a Python program to use a TCP connection (to another program) to retrieve sets of 9 bytes.
The first of the nine bytes represents a char, and the rest represent a double.
How can I extract this information from a python socket? Will I have to do the maths manually to convert the stream data or is there a better way?
Take a look at python struct
http://docs.python.org/library/struct.html
So something like
from struct import unpack
unpack('cd',socket_read_buffer)
--> ('c', 3.1415)
Be careful of Endianness.
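A sketch of the receiving side under those caveats (the helper name is mine). Note that a bare 'cd' format uses native alignment and is typically 16 bytes, because the double gets padded to an 8-byte boundary; prefixing the format with '!' (or '<'/'>') gives the expected 9-byte layout and also pins down the byte order:

import struct

RECORD_FMT = '!cd'                    # network byte order: 1 char + 8-byte double = 9 bytes
RECORD_SIZE = struct.calcsize(RECORD_FMT)

def recv_record(sock):
    # recv() may return fewer bytes than requested, so loop until a full record is in.
    buf = b''
    while len(buf) < RECORD_SIZE:
        chunk = sock.recv(RECORD_SIZE - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-record")
        buf += chunk
    return struct.unpack(RECORD_FMT, buf)   # e.g. (b'c', 3.1415)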
If both client and server are written in python, I'd suggest you use pickle. It lets you convert python variables to bytes and then back to python variables with their original type.
#SENDER
import pickle, struct
#convert the variable to bytes with pickle
bytes = pickle.dumps(original_variable)
#convert its size to a 4 bytes int (in bytes)
#I: 4 bytes unsigned int
#!: network (= big-endian)
length = struct.pack("!I", len(bytes))
a_socket.sendall(length)
a_socket.sendall(bytes)
#RECEIVER
import pickle, struct
#This function lets you receive a given number of bytes and regroup them
def recvall(socket, count):
    buf = b""
    while count > 0:
        newbuf = socket.recv(count)
        if not newbuf: return None
        buf += newbuf
        count -= len(newbuf)
    return buf

length, = struct.unpack("!I", recvall(socket, 4))  # we know the first reception is 4 bytes
original_variable = pickle.loads(recvall(socket, length))