I am using a blocking Python socket of type socket.socket(socket.AF_INET, socket.SOCK_STREAM) to send messages from my client to my server. If I send messages in quick succession (but not simultaneously), I get the following error on my server:
in receive
size = int(rec_sock.recv(HEADER_SIZE).decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Before each message I send a header containing the length of the message that follows. The client encodes the header in UTF-8, so decoding it shouldn't raise this error. The header is also the only part of the message that the receiving side attempts to decode as UTF-8, so I am not sure how this error can happen.
I am using the following methods to send, receive, and make a header:
BUF_SIZE = 16384
HEADER_SIZE = 16

def receive(rec_sock: socket.socket) -> Any:
    message = b''
    size = int(rec_sock.recv(HEADER_SIZE).decode('utf-8'))
    if size:
        while len(message) < size:
            data = rec_sock.recv(BUF_SIZE)
            message += data
    return pickle.loads(message) if len(message) else None

def send(resp: Any, send_sock: socket.socket) -> None:
    pickeled = pickle.dumps(resp)
    send_sock.send(make_header(len(pickeled)))
    send_sock.send(pickeled)

def make_header(msg_size: int) -> bytes:
    encoded = str(msg_size).encode('utf-8')
    return b'0' * (HEADER_SIZE - len(encoded)) + encoded
The issue was that my receive method always asked for a full buffer (BUF_SIZE bytes), even when fewer bytes of the current message remained. As a result, when two messages were sent in quick succession, the previous call to receive also consumed the header of the next message, and the pickled content of the next message was then read as its header (which cannot be decoded as UTF-8).
Changing the receive method to this fixed it:
def receive(rec_sock: socket.socket) -> Any:
    message = b''
    size = int(rec_sock.recv(HEADER_SIZE).decode('utf-8'))
    print("Waiting for", size, "bytes ...")
    if size:
        while len(message) < size:
            remaining = size - len(message)
            read_len = BUF_SIZE if remaining >= BUF_SIZE else remaining
            data = rec_sock.recv(read_len)
            message += data
        print("Received", len(message), "bytes.")
    return pickle.loads(message) if len(message) else None
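One remaining caveat: recv(HEADER_SIZE) may itself return fewer than HEADER_SIZE bytes, and recv returns b'' once the peer closes the connection, which would make the loop above spin forever. A minimal sketch of a helper that covers both cases is shown here. The name recv_exact is hypothetical (it is not part of the socket module), and the header format is assumed to be the same 16-character zero-padded length as above.

```python
import pickle
import socket
from typing import Any

HEADER_SIZE = 16  # same fixed-width, zero-padded length header as above


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes the connection first."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)  # never ask for more than this message needs
        if not chunk:                 # b'' means the other side closed the socket
            raise ConnectionError("socket closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def receive(sock: socket.socket) -> Any:
    """Read one length-prefixed pickle message; the header is read in full as well."""
    size = int(recv_exact(sock, HEADER_SIZE).decode("utf-8"))
    return pickle.loads(recv_exact(sock, size)) if size else None
```

On the sending side, send_sock.sendall(...) is the matching choice, since send() is allowed to transmit only part of the buffer.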
I'm working on a program that receives a string sent over WiFi from an Android app. The program was originally written for Python 2.7, but after adding some functionality I moved it to Python 3.7. Since that change, my data has an extra letter at the front, and for the life of me I can't figure out why.
Here's a snippet of my code. It's a really simple if statement that checks which command was sent from the Android app and controls the Raspberry Pi (4) camera (v2) accordingly.
This part sets up the connection and waits to see which command I send.
isoCmd = ['auto','100','200','300','400','500','640','800']
HOST = ''
PORT = 21567
BUFSIZE = 1024
ADDR = (HOST,PORT)
brightness = 50
timelapse = 0
tcpSerSock = socket(AF_INET, SOCK_STREAM)
tcpSerSock.bind(ADDR)
tcpSerSock.listen(5)
while True:
    print ('Waiting for connection')
    tcpCliSock,addr = tcpSerSock.accept()
    try:
        while True:
            data = ''
            brightness = ' '
            data = tcpCliSock.recv(BUFSIZE)
            dataStr = str(data[1:])
            print ("Here's data ",dataStr)
            if not data:
                break
            if data in isoCmd:
                if data == "auto":
                    camera.iso = 0
                    print ('ISO: Auto')
                else:
                    camera.iso = int(data)
                    print ('ISO: '), data
When I start the program this is what I see:
Waiting for connection
#If I send command '300'
Here's data b'300'
Here's data b''
Waiting for connection
I'm not sure where the extra b'' is coming from. As a test I added a "b" at the beginning of each item in the array, which worked for the commands I defined myself, but not for the commands that control the Pi camera, since those values have no extra b at the beginning. (Did that make sense?) My point is, I know I'm able to send commands without a problem; I'm just not sure how to get rid of the extra letter. If anyone could give me some advice, that would be great. Thanks for helping.
Byte strings are represented with the b prefix.
Although you can see the text when you print it, the value is still a bytes object.
To get a normal string out of it, call decode on the received bytes object (data here, since dataStr has already been passed through str()):
data.decode("utf-8")
b'data' simply means that the data inside the quotes was received as bytes. As mentioned in the other answers, you have to decode it with decode('utf-8') to get a string.
I have updated your program below to be compatible with Python 3.7+:
from socket import *

isoCmd = ['auto','100','200','300','400','500','640','800']
HOST = ''
PORT = 21567
BUFSIZE = 1024
ADDR = (HOST,PORT)
brightness = 50
timelapse = 0
tcpSerSock = socket(AF_INET, SOCK_STREAM)
tcpSerSock.bind(ADDR)
tcpSerSock.listen(5)
while True:
    print ('Waiting for connection')
    tcpCliSock,addr = tcpSerSock.accept()
    try:
        while True:
            data = ''
            brightness = ' '
            data = tcpCliSock.recv(BUFSIZE).decode('utf-8')
            print ("Here's data "+data)
            if not data:
                break
            if data in isoCmd:
                if data == "auto":
                    camera.iso = 0
                    print ('ISO: Auto')
                else:
                    camera.iso = int(data)
                    print ('ISO: '+ data)
    except Exception as e:
        print(e)
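To see why the original comparison never matched, here is a small standalone illustration that needs no socket. The list iso_cmd and the literal b'300' are stand-ins for isoCmd and for whatever recv() returned, so treat them as assumptions made for the demo.

```python
iso_cmd = ['auto', '100', '200', '300']  # str commands, like isoCmd above

raw = b'300'                  # recv() returns bytes in Python 3
print(str(raw))               # b'300'  (str() gives the repr, hence the "extra letter")
print(raw in iso_cmd)         # False   (bytes never compare equal to str)

text = raw.decode('utf-8')    # convert the bytes to real text
print(text)                   # 300
print(text in iso_cmd)        # True
```

The second line of output in the question, b'', is not a stray value either: recv() returns an empty bytes object once the client has closed the connection, which is what the "if not data: break" check relies on.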
Explanation
I'm currently trying to control a smart power strip using a Python script. To accomplish this, I'm using a TCP connection with the socket module. Around 75% of the time, I get the response I was looking for and everything works perfectly. However, around 25% of the time, the response is cut off at exactly the same length, 1024 bytes. This doesn't make any sense to me, as my buffer size is actually set to 2048 bytes. How long I wait before calling recv() doesn't seem to affect this either. Although TCP is a stream of bytes, is it still possible that this has to do with packet fragmentation?
Code
Main Code
ip='192.168.0.62'
port=9999
sock_tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock_tcp.connect((ip, port))
sock_tcp.send(encrypt('{"system":{"get_sysinfo":{}}}'))
data = sock_tcp.recv(2048)
sock_tcp.close()
print len(data) #On successful runs output is 1221, on unsuccessful runs it is 1024
rec = decrypt(data[4:])
print str(rec) #See output below
Encrypt Function
def encrypt(string):
    key = 171
    result = pack('>I', len(string))
    for i in string:
        a = key ^ ord(i)
        key = a
        result += chr(a)
    return result
Decrypt Function
def decrypt(string):
    key = 171
    result = ""
    for i in string:
        a = key ^ ord(i)
        key = ord(i)
        result += chr(a)
    return result
Output
The string itself that I receive. It's most likely not relevant, but I thought I would include it anyway. This is the value of the variable rec.
Desired and regular output
Full desired output
{"system":{"get_sysinfo":{"sw_ver":"1.0.6 Build 180627
Rel.081000","hw_ver":"1.0","model":"HS300(US)","deviceId":"80067B24A755F99C4D6C1807455E09F91AB7B2AA","oemId":"5C9E6254BEBAED63B2B6102966D24C17","hwId":"34C41AA028022D0CCEA5E678E8547C54","rssi":-60,"longitude_i":-1222955,"latitude_i":379078,"alias":"TP-LINK_Power
Strip_4F01","mic_type":"IOT.SMARTPLUGSWITCH","feature":"TIM:ENE","mac":"B0:BE:76:12:4F:01","updating":0,"led_off":0,"children":[{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA00","state":0,"alias":"CezHeat","on_time":0,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA01","state":1,"alias":"CezUVB","on_time":191208,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA02","state":1,"alias":"CyanHeat","on_time":191208,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA03","state":1,"alias":"ZanderHeat","on_time":191208,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA04","state":1,"alias":"CairoHeat","on_time":191208,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA05","state":1,"alias":"KodaMister","on_time":191208,"next_action":{"type":-1}}],"child_num":6,"err_code":0}}}
Abnormal and rarer output
Cut off output
{"system":{"get_sysinfo":{"sw_ver":"1.0.6 Build 180627
Rel.081000","hw_ver":"1.0","model":"HS300(US)","deviceId":"80067B24A755F99C4D6C1807455E09F91AB7B2AA","oemId":"5C9E6254BEBAED63B2B6102966D24C17","hwId":"34C41AA028022D0CCEA5E678E8547C54","rssi":-59,"longitude_i":-1222955,"latitude_i":379078,"alias":"TP-LINK_Power
Strip_4F01","mic_type":"IOT.SMARTPLUGSWITCH","feature":"TIM:ENE","mac":"B0:BE:76:12:4F:01","updating":0,"led_off":0,"children":[{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA00","state":0,"alias":"CezHeat","on_time":0,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA01","state":1,"alias":"CezUVB","on_time":191207,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA02","state":1,"alias":"CyanHeat","on_time":191207,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA03","state":1,"alias":"ZanderHeat","on_time":191207,"next_action":{"type":-1}},{"id":"80067B24A755F99C4D6C1807455E09F91AB7B2AA04","state":1,"alias":"CairoHeat","on
Conclusion
If anyone could provide me with a solution or explanation as to why the output/stream gets cut off, it would be much appreciated. I used a lot of the code from this open source module. I'm also looking to understand more of how this all works, so if you could explain a bit more I would really appreciate it.
As per the documentation, the bufsize argument only specifies the maximum amount of data to be read:
socket.recv(bufsize[, flags])
Receive data from the socket. The return
value is a bytes object representing the data received. The maximum
amount of data to be received at once is specified by bufsize. See the
Unix manual page recv(2) for the meaning of the optional argument
flags; it defaults to zero.
To ensure the full data is transferred, a function like this can be used. It waits for the end of the socket connection (indicated by an empty bytes object returned from recv):
def recv_all(connection):
    """
    Function for all data
    :param connection: socket connection
    :return: received data
    """
    data = list()
    while True:
        data.append(connection.recv(2048))
        if not data[-1]:
            return b''.join(data)
Another example that might fit your application better could be to wait for a fixed message size (1221 as indicated by your question):
def recv_message(connection):
    data = list()
    transferred_bytes = 0
    while transferred_bytes < 1221:
        data.append(connection.recv(min(1221-transferred_bytes, 2048)))
        if not data[-1]:
            raise RuntimeError("socket connection broken")
        transferred_bytes += len(data[-1])
    return b''.join(data)
This is only a complement to SimonF's answer. The cause of the problem is indeed that TCP is a stream protocol, so packets can be fragmented or reassembled at any stage: the sender's TCP/IP stack, network equipment, or the receiver's TCP/IP stack (for simplicity I include the user-layer library in the TCP/IP stack here).
That is why you should always use a higher-level protocol on top of TCP to split the stream into meaningful messages. Here you can note that a message ends with '}}}', so you could accumulate the input in a buffer until you find that pattern:
def recv_until(c, guard):
    """Receive data from a socket until guard is found in the input"""
    guard_sz = len(guard) - 1
    data = b''
    sz = 0
    while True:
        buffer = c.recv(1024)    # read in chunks of 1024 bytes (change the value to your needs)
        if not buffer:           # connection closed before the guard arrived
            raise RuntimeError("socket connection broken")
        got = len(buffer)
        data += buffer           # concatenate in buffer
        ix = data.find(guard, sz - guard_sz if sz > guard_sz else 0)    # is guard found?
        if ix != -1:
            return (data[:ix + guard_sz + 1],    # return the message, and what could be behind it
                    data[ix + guard_sz + 1:])
        sz += got
The trick is to re-scan the last guard_sz bytes of the previously received data, in case the guard is split across two chunks.
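Since the payload is XOR-obfuscated on the wire, a plain-text guard will not appear literally in the raw bytes, so another option is to rely on the 4-byte big-endian length that the question's encrypt function puts in front of each request. The sketch below assumes the device frames its responses the same way (the question's data[4:] suggests it does, but that is an assumption), and it ports the question's encrypt and decrypt to Python 3 bytes. The names recv_exact and recv_framed are hypothetical helpers, not library functions.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Keep calling recv until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket connection broken")
        buf += chunk
    return bytes(buf)


def recv_framed(sock: socket.socket) -> bytes:
    """Read a 4-byte big-endian length, then exactly that many payload bytes."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def encrypt(text: str, key: int = 171) -> bytes:
    """Python 3 port of the question's encrypt: autokey XOR plus length prefix."""
    body = bytearray()
    for byte in text.encode("utf-8"):
        key ^= byte          # each output byte becomes the key for the next one
        body.append(key)
    return struct.pack(">I", len(body)) + bytes(body)


def decrypt(payload: bytes, key: int = 171) -> str:
    """Python 3 port of the question's decrypt (payload excludes the 4-byte length)."""
    out = bytearray()
    for byte in payload:
        out.append(key ^ byte)
        key = byte           # the previous ciphertext byte is the next key
    return out.decode("utf-8")
```

With these helpers the main code would become sock_tcp.sendall(encrypt('{"system":{"get_sysinfo":{}}}')) followed by rec = decrypt(recv_framed(sock_tcp)), and the result no longer depends on how the response happens to be split into segments.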
Marco, please use the socket's recv_into(buffer[, nbytes[, flags]]) method.
My example of a TCP micro-server:
import socket
import struct

def readReliably(s,n):
    buf = bytearray(n)
    view = memoryview(buf)
    sz = 0
    while sz < n:
        k = s.recv_into(view[sz:],n-sz)
        if k == 0:    # peer closed the connection before n bytes arrived
            raise RuntimeError("socket connection broken")
        sz += k
    # print('readReliably()',sz)
    return sz,buf

def writeReliably(s,buf,n):
    sz = 0
    while sz < n:
        k = s.send(buf[sz:n])    # the second argument of send() is flags, not a length
        sz += k
    # obj = s.makefile(mode='w')
    # obj.flush()
    # print('writeReliably()',sz)
    return sz

# Client
host = "127.0.0.1"
port = 23456
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(10)
s.connect((host,port))
# Request
buf = struct.pack("4B",*[0x01,0x02,0x03,0x04])
writeReliably(s,buf,4)
# Response
sz,buf = readReliably(s,4)
a = struct.unpack("4B",buf)
print(repr(a))

# Server
s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
#s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
#s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
#s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
s.bind((host,port))
s.listen(10) # unaccepted connections
while True:
    sk,skfrom = s.accept()
    sz,buf = readReliably(sk,4)
    a = struct.unpack("4B",buf)
    print(repr(a))
    # ...
    writeReliably(sk,struct.pack("4B",*[0x01,0x02,0x03,0x04]),4)
I'm trying to capture and send a beacon frame using the following code:
def SniffIncomingProbes():
    #create a general socket to monitor ongoing traffic
    sniffer = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(0x0003))
    sniffer.bind((interface, 0x0003))
    #byte [30] in the packet is the packet type/subtype field
    #\x40 is a probe request, \x80 is a beacon
    while True:
        if frame_subtype==8:
            packet = sniffer.recvfrom(2048)[0]
            if packet[30] == "\x80":
                #byte [67] in the packet contains the length of the SSID
                SSID = packet[68: 68 + ord(packet[67])]
                MAC = packet[40:46].encode('hex')
                association_set.add((MAC,SSID))
                PrintNicely()
                #try and send a beacon on my own
                if len(SSID) == 4:
                    newPacket = packet[:68] + "MOSS" + packet[72:]
                    newPacket = newPacket[:46] + ("\xAC\xDC\xDE\xAD\xBE\xEF") + newPacket[52:]
                    #get the FCS into unsigned form, convert to a
                    #string, and remove the "0x" characters in the beginning of the string
                    FCS = str(hex(abs(binascii.crc32(newPacket[:len(packet)-4]))))[2:]
                    if len(FCS)%2 == 1:
                        FCS = "0" + FCS
                    print FCS
                    print len(FCS)
                    newPacket = newPacket[:len(newPacket)-4]+ FCS.decode("hex")
                    sniffer.send(newPacket)
        elif frame_subtype==4:
            packet = sniffer.recvfrom(2048)[0]
            if packet[30] == "\x40":
                #byte [55] in the packet contains the length of the SSID
                SSID = packet[56: 56 + ord(packet[55])]
                MAC = packet[40:46].encode('hex')
                association_set.add((MAC,SSID))
                PrintNicely()
When I run Wireshark and airodump I can see the packets with SSID "MOSS" going through, and they show up as a beacon in airodump.
Yet when I run Windows Network Monitor on a remote machine, I don't see these packets going through.
Also, my CRC checksum seems to be wrong (checked with Wireshark).
It seems I am not sending the packet correctly and the FCS check fails.
Any input will be appreciated,
thank you in advance.
UPDATE:
The frame check sequence (FCS, spelled FSC in the code below) is now reported as Good and is no longer flagged by Wireshark, BUT the packet is still not transmitted to any remote machine on the network.
I changed the FSC code to:
def FSCCheckSum(data):
    #get the crc32 checksum of the data,
    #without the radiotap header(first 30 bytes) and the FSC (last 4 bytes)
    #and change it to unsigned form
    #convert the hex representation to a string
    #and remove the "0x" characters at the beginning of the string
    FSC = binascii.crc32(data[30:-4]) % (1<<32)
    FSC = str(hex(FSC))[2:]
    #we might get zeroes(not showing) from the left,
    #so we pad the number from the left with "0"s to match 4 bytes(4 hex pairs)
    FSC = "0" * (8-len(FSC)) + FSC
    #reverse the byte ordering
    return FSC.decode("hex")[::-1]
So I now just use the following code to modify the packet. *
Notice that I also change the source address now:
newPacket = packet[:68] + "MOSS" + packet[72:]
newPacket = newPacket[:40] + ("\xAC\xDC\xDE\xAD\xBE\xEF") + newPacket[46:]
newPacket = newPacket[:46] + ("\xAC\xDC\xDE\xAD\xBE\xEF") + newPacket[52:]
newPacket = newPacket[:-4] + FSCCheckSum(newPacket)
sniffer.send(newPacket)
(* I set the source address and the BSSID in separate statements so it is easier to read and understand; I know they can be merged.)
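For reference, the hex-string round trip in FSCCheckSum can be written more directly with struct, which packs the unsigned CRC-32 least-significant byte first in one step. This is only a Python 3 sketch of the same checksum computation and does not address why the frame is not seen on remote machines. It assumes, as the code above does, that the captured packet ends with a 4-byte FCS. Instead of hard-coding 30, it reads the radiotap header length from the header's own length field (bytes 2 and 3, little-endian). The function names are hypothetical.

```python
import binascii
import struct


def radiotap_length(packet: bytes) -> int:
    """Length of the radiotap header, taken from its length field (bytes 2-3, little-endian)."""
    (length,) = struct.unpack_from("<H", packet, 2)
    return length


def fcs_bytes(frame: bytes) -> bytes:
    """Unsigned CRC-32 of the frame, packed least-significant byte first."""
    return struct.pack("<I", binascii.crc32(frame) & 0xFFFFFFFF)


def with_fresh_fcs(packet: bytes) -> bytes:
    """Replace the trailing 4-byte FCS of radiotap header + 802.11 frame + FCS."""
    start = radiotap_length(packet)
    return packet[:-4] + fcs_bytes(packet[start:-4])
```

After editing the SSID and addresses, newPacket = with_fresh_fcs(newPacket) would then take the place of the manual padding and byte reversal.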
I am trying to decode an error message from a UDP tracker.
Below is my code.
import struct, socket
client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
info_hash = "%1D%D4%D1%EDQn%DB%5CL%83%90%1B%2B%F8%83%A2%19%C0%7C%98"
peer_id = "-UT1234-m%09%B2%D5%99%FA%1Fj%88%AC%0D%A7"
action =1 # announce
downloaded = 0
left = 0
uploaded = 0
event =0
ip = 0
key = 0
num_want = -1
port = 9999
announce_pack = struct.pack(">QLL20s20sQQQLLLLi",connection_id,action,transaction_id,info_hash,peer_id,downloaded,left,uploaded,event,ip,key,num_want,port)
client_socket.sendto(announce_pack, ("tracker.ccc.de", 80))
res = client_socket.recv(1024)
try:
    action = struct.unpack(">HLLLLQQQ20s20sLLH", res[:98])
except Exception as e:
    error_action, error_tid, error_message = struct.unpack(">ii8s", res)
    raise TrackerRequestException(error_message.decode('utf-16'), "")
I am able to unpack the message, but for some reason I am getting the error message as
\uc061\u51be\u5841\ud3bf
How do I decode this into proper text?
I got the protocol description from this link: http://bittorrent.org/beps/bep_0015.html
There can be an exception for any number of reasons; you could have read too little data for example (socket.recv(1024) can return fewer bytes if that's all that's available at that time).
You need to follow the BEP more closely. You need to check that you have received at least 8 bytes first, and then check for the TID, and the action code. Only if your action code is set to 3 is the response an error message.
The message is not encoded in UTF-16. It should just be ASCII data.
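The checks described above could be sketched as follows. The function name, the TrackerError exception, and the expected_tid parameter are hypothetical names chosen for this example. The layout (a 32-bit action, a 32-bit transaction id, and for action 3 a message filling the rest of the packet) follows the error response described in BEP 15.

```python
import struct

ACTION_ERROR = 3  # action code of an error response in BEP 15


class TrackerError(Exception):
    """Hypothetical exception type used by this sketch."""


def check_tracker_response(res: bytes, expected_tid: int) -> int:
    """Validate the common 8-byte prefix of a response and return its action code."""
    if len(res) < 8:
        raise TrackerError("response too short: %d bytes" % len(res))
    action, tid = struct.unpack_from(">II", res, 0)
    if tid != expected_tid:
        raise TrackerError("transaction id mismatch")
    if action == ACTION_ERROR:
        # Everything after the first 8 bytes is the message; its length is not fixed.
        raise TrackerError(res[8:].decode("ascii", errors="replace"))
    return action
```

Only when the returned action matches the request (1 for announce) should the rest of res be unpacked with the announce response layout.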