Losing data in received serial string - python

Part of a larger project needs to receive a long hex character string from a serial port using a Raspberry Pi. I thought I had it all working, but then discovered it was losing a chunk of data in the middle of the string.
def BUTTON_Clicked(self, widget, data=None):
    ser = serial.Serial("/dev/ex_device", 115200, timeout=3)
    RECEIVEDfile = open("RECIEVED.txt", "r+", 0)  # unbuffered
    # Commands sent out
    ser.write("*n\r")
    time.sleep(1)
    ser.flush()
    ser.write("*E")
    ser.write("\r")
    # Read back string rx'd
    RECEIVED = ser.read()
    RECEIVED = re.sub(r'[\W_]+', '', RECEIVED)  # remove non-alphanumeric characters (caused by noise maybe?)
    RECEIVEDfile.write(re.sub("(.{4})", "\\1\n", RECEIVED, 0, re.DOTALL))  # new line every 4 characters
    RECEIVEDfile.close()
    ser.write("*i\r")
    ser.close()
This is the script used to retrieve the data. The baud rate and serial commands are set correctly and the script is run unbuffered (-u), yet the full string is not saved. The string is approx. 16384 characters long, but only approx. 9520 characters (it varies) are being saved (I can't supply the string for analysis). Does anyone know what I'm missing? Cheers for any help you can give me.

Glad my comment helped!
Set timeout to a low number, e.g. 1 second, then try something like this. It tries to read a large chunk, but times out quickly and doesn't block for long. Whatever has been read is appended to a list (rx_buf). Then loop for as long as there are pending bytes to read. The real problem is knowing when not to expect any more data.
rx_buf = [ser.read(16384)]  # Try reading a large chunk of data, blocking for timeout secs.
while True:  # Loop to read remaining data, to end of receive buffer.
    pending = ser.inWaiting()
    if pending:
        rx_buf.append(ser.read(pending))  # Append read chunks to the list.
    else:
        break
rx_data = ''.join(rx_buf)  # Join the chunks, to get a string of serial data.
The reason I'm putting the chunks in a list is that the join operation is much more efficient than '+=' on strings.

According to this question, you need to read the data from the input buffer in chunks (here a single byte):
out = ''
# Let's wait one second before reading output (give the device time to answer).
time.sleep(1)
while ser.inWaiting() > 0:
    out += ser.read(1)
I suspect what is happening in your case is that you're getting a single buffer's worth of data, the size of which varies with the state of the buffer.

Related

How to split serial input into messages

I have 2 serial ports on a Raspberry Pi. Currently, the code reads data from port 1 and writes it to port 2, and vice versa. What I am trying to do is split the input that I am reading from both ports into different messages (groups of characters) based on a specified character (for example # or !).
Also, how can I modify the 'for' loop at the end so that I can split the messages for both ports? Currently the code only splits data from one port.
I have already tried split() and it gives a TypeError; the reason may be that the serial input is a different type.
import serial

ser1 = serial.Serial('/dev/ttyUSB0', timeout=2)
ser2 = serial.Serial('/dev/ttyUSB1', timeout=2)
print(ser1)
print(ser2)

ser1_list = []
ser2_list = []

while True:
    data1 = ser1.readlines()
    data2 = ser2.readlines()
    if data1 or data2:
        ser1_list.extend(data1)
        ser2.writelines(data1)
        byte_split1 = ser1_list.split("1")
        ser2_list.extend(data2)
        ser1.writelines(data2)
        byte_split2 = ser1.split('1')
        for x in byte_split1:
            print(x)
    else:
        break

ser1.close()
ser2.close()
Example for the expected result:
If the input is:
abcde#fghi#jklmnop#
it would output:
abcde
fghi
jklmnop
It appears that you're trying to set up something like a chat between the two locations. Please consider looking up how to do that in canonical fashion:
Split this into parallel processes, one each for ser1 => ser2 and the other for ser2 => ser1. Each process will handle communication in its own direction.
This allows you to write one listener for each port; your two processes will be identical, except that you instantiate them with the ports in opposite order. Each listener gathers traffic until it gets to a separator; at that point, it writes the buffer contents up to that point and moves the buffer pointer. There are plenty of I/O packages to do this for you; you're merely "chunking" the stream with that separator character.
That should be enough guidance and references for you to find the examples you need.
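As a sketch of that chunking approach (illustrative only: extract_messages is a hypothetical helper, not part of pyserial), the buffer-and-split logic for each listener might look like:

```python
def extract_messages(buffer, sep=b"#"):
    """Split buffer at sep; return (complete messages, leftover partial bytes)."""
    *messages, remainder = buffer.split(sep)
    return messages, remainder

# In each listener process, something like (assuming an open Serial `ser`
# to read from and `other` to write to):
#   buf += ser.read(ser.in_waiting or 1)
#   msgs, buf = extract_messages(buf)
#   for m in msgs:
#       other.write(m + b"\n")
```

Keeping the leftover bytes in the buffer is the key point: a message split across two reads is completed on the next pass through the loop.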

How to receive and assemble byte arrays with variable lengths in Python sockets?

I am trying to send large byte arrays of a Protobuf class from a Java client to a Python server. However, they have variable lengths, because sometimes I send the bytes of an object from ClassA and sometimes from ClassB.
I have a Python socket server with the following code inside the function that listens to the socket:
byte_array = bytearray()
# receive the data in small chunks and print it
while True:
    data = connection.recv(64)
    if data:
        # output received data
        logger.debug("Data: %s" % data)
        byte_array.extend(data)
    else:
        # no more data -- quit the loop
        logger.debug("no more data.")
        break
logger.info("Generating response...")
connection.send(generate_response(byte_array))
logger.info("Sent response.")
I am assembling the large byte array by appending the 64-byte chunks as I receive them.
However, when the byte array is fully transmitted and there is nothing left to send, the server hangs on the connection.recv line.
I read that this is because recv blocks until either it receives something or the connection is closed. However, I do not want to close the connection because I want to send my response back to the client after processing the whole byte array.
I want to know when the byte array I am receiving has been fully transmitted, so that I can avoid this blocking.
I can think of three options:
Set a predefined "end" byte, delimiting the end of the byte array.
Send the size of the byte array beforehand and then instead of while True I have a while bytes_read < expected_bytes cycle.
Set a timeout on the connection and assume that when a timeout occurs, everything has already been sent.
I am inclined toward the first option; however, I do not know what character I should use to end the byte array, nor how to read it in my Python code.
Any suggestions?
Thanks.
I would personally go for the second option (combined with a reasonable timeout to cater for evil clients that send only half of the file and hang there forever). Delimiting character is good if you can absolutely guarantee it is unique in your stream (but you still need the timeout).
If you cannot guarantee your delimiter to be unique, sending the size the client needs to expect solves the problem. If your metadata is padded to a fixed length, you do not need to worry about delimiters and detecting them.
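A minimal sketch of that size-prefix idea, assuming the sender prepends a 4-byte big-endian length to each payload (the framing format and helper names here are illustrative, not a fixed protocol):

```python
import struct

def frame(payload):
    """Prefix a payload with its 4-byte big-endian length."""
    return struct.pack(">I", len(payload)) + payload

def recv_exactly(sock, n):
    """Read exactly n bytes from a connected socket, or raise on early EOF."""
    chunks = []
    while n > 0:
        chunk = sock.recv(min(n, 4096))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_frame(sock):
    """Read one length-prefixed message."""
    (length,) = struct.unpack(">I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)
```

On the Java side the client would write the same 4-byte length before the Protobuf bytes; because the length is known up front, the payload can contain any byte values.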
Option 1 :
For the first option, you could pick an end marker which won't occur anywhere in your actual message.
You can create a string, e.g. "END", convert it to bytes, and send it from your Java program. After receiving, compare each chunk against the marker (comparing raw bytes is safest, since decoding arbitrary binary data can fail).
Note: the end marker you send must be no larger than the chunk size, so that it arrives in a single chunk and can be compared exactly.
byte_array = bytearray()
# receive the data in small chunks and print it
while True:
    data = connection.recv(64)
    if data != b"END":  # compare raw bytes; decoding arbitrary binary data can raise
        # output received data
        logger.debug("Data: %s" % data)
        byte_array.extend(data)
    else:
        # end marker seen -- quit the loop
        logger.debug("no more data.")
        break
logger.info("Generating response...")
connection.send(generate_response(byte_array))
logger.info("Sent response.")
Option 2 :
For the second option you will need to modify the while loop to run according to the metadata. Here the metadata is the first chunk, which holds the number of chunks that will be sent. It could go something like this:
byte_array = bytearray()
# receive the data in small chunks and print it
loop_count = 0
count = 1
meta = True
while loop_count < count:
    data = connection.recv(64)
    if meta:
        count = int(data.decode())  # first chunk is the number of chunks that will be sent
        meta = False
    else:
        logger.debug("Data: %s" % data)
        byte_array.extend(data)
        loop_count = loop_count + 1
# no more data
logger.debug("no more data.")
logger.info("Generating response...")
connection.send(generate_response(byte_array))
logger.info("Sent response.")
Option 3 :
It will also work, provided you are sure there will be no network delay; the only issue is that your Java program will have to wait for the response from the Python server until the timeout elapses.
Option 4 :
You could use a non-blocking socket which runs until it doesn't receive anything for a predetermined period of time. Although I don't recommend it for your situation, you can read about it and see if it suits your needs.
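A sketch of the timeout variant of options 3 and 4, using settimeout() on an otherwise blocking socket; idle_timeout seconds of silence are treated as end of transmission, which is only safe if the sender never stalls that long (helper name and defaults are illustrative):

```python
import socket

def recv_until_quiet(conn, idle_timeout=1.0, chunk=64):
    """Collect data until the peer closes or goes quiet for idle_timeout secs."""
    conn.settimeout(idle_timeout)
    buf = bytearray()
    while True:
        try:
            data = conn.recv(chunk)
        except socket.timeout:
            break  # no data for idle_timeout seconds: assume transmission done
        if not data:
            break  # peer closed the connection
        buf.extend(data)
    return bytes(buf)
```
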

pexpect - how to match a lot of blocks in large output strings

I am using pexpect to login remote devices to query some metrics.
The output of my command is large (about 15k bytes) and I need to extract many metrics from the output. So I use a big pattern string like
"key1:([0-9a-zA-Z]*).*key2:([0-9a-zA-Z]*).*key3......"
to match all the metrics. When the pattern is too long, expect blocks and never terminates.
I've already set maxread to 20k, but it doesn't help. Does anyone have ideas on how to resolve the issue?
Edit: here is the code snippet:
session.sendline(command_info["text"])
ret = session.expect([CMD_PROMPT, pexpect.EOF, pexpect.TIMEOUT])
if ret == 1:
    logger.error("EOF error.")
    return
if ret == 2:
    logger.error("Timeout")
    return
result = re.match(getExpectedGroupStr(command_info, logger), session.before, re.S)
If I use a big pattern string instead of CMD_PROMPT, session.expect will hang and never terminate even if the timeout is reached.
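One workaround to sketch here, assuming CMD_PROMPT matches reliably: expect only the prompt, then extract each metric from session.before locally, with one small pattern per key instead of a single giant regex. extract_metrics and the key list below are illustrative:

```python
import re

def extract_metrics(output, keys):
    """Pull key:value metrics out of captured output, one small regex per key."""
    metrics = {}
    for key in keys:
        m = re.search(r"%s:([0-9a-zA-Z]*)" % re.escape(key), output)
        if m:
            metrics[key] = m.group(1)
    return metrics

# After session.expect([CMD_PROMPT, ...]) succeeds:
#   metrics = extract_metrics(session.before, ["key1", "key2", "key3"])
```

This keeps pexpect doing what it is good at (finding the prompt) and leaves the heavy pattern matching to plain re on an in-memory string.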

Python, serial - changing baudrate, strange behaviour

I am having trouble changing the baud rate while the port is running. All the communication runs at 100k baud, but I also need to send some data at 10k baud. I've read that I should use the setBaudrate method, so I tried this:
ser = serial.Serial(2, baudrate=BAUD, timeout=TIMEOUT)

def reset(string):
    if string:
        ser.flushInput()  # erase input and output buffers
        ser.flushOutput()
        ser.setBaudrate(RESET_BAUD)  # change baudrate to 10k
        ser.write(string)
        ser.setBaudrate(BAUD)  # go back to 100k
The problem is, it doesn't work right. I don't know what is wrong, but the string just isn't received properly. Here is the interesting part: if I remove the last line (going back to 100k) and run this function from the shell, everything is fine; I can then run the last command directly in the shell, rather than inside the function.
My question is what exactly happens here and how to avoid it. All I need is a function that sends a string at a different baud rate and then returns to the original baud rate.
You need to wait long enough for the string to be sent before resetting the BAUD rate - otherwise it changes while some of it is still in the serial port (hardware) buffer.
Add time.sleep(0.01*len(string)) before the last line.
BTW, try not to use names like string (the name of a standard module) as variable names, as it can cause problems.
My guess is that the baud rate is being changed before the data is actually sent. A good bet is to force the data to be sent before trying to change the baud rate.
According to the docs, this is done by calling Serial.flush() (not flushInput() or flushOutput(), as these just discard the buffer contents).
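Putting both answers together, a sketch might look like this; send_at_baud is an illustrative helper, and assigning to ser.baudrate is the modern pyserial spelling of setBaudrate():

```python
def send_at_baud(ser, payload, slow_baud, fast_baud):
    """Send payload at slow_baud, then restore fast_baud once it has drained."""
    ser.baudrate = slow_baud   # drop to the slower rate
    ser.write(payload)
    ser.flush()                # block until the output buffer actually drains
    ser.baudrate = fast_baud   # only now is it safe to switch back
```

On some platforms flush() may return before the last bytes have left the hardware FIFO, so a short time.sleep() after it can still help.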

TCP Sockets: Double messages

I'm having a problem with sockets in python.
I have a a TCP server and client that send each other data in a while 1 loop.
It packs 2 shorts with the struct module (struct.pack("hh", mousex, mousey)). But sometimes when recv'ing the data on the other computer, it seems like 2 messages have been glued together. Is this Nagle's algorithm?
What exactly is going on here? Thanks in advance.
I agree with the other posters that "TCP just does that". TCP guarantees that your bytes arrive in the right order, but makes no guarantees about the sizes of the chunks they arrive in. I would add that TCP is also allowed to split a single send into multiple recv's, or, for example, to split aabb, ccdd into aab, bcc, dd.
I put together this module for dealing with the relevant issues in python:
http://stromberg.dnsalias.org/~strombrg/bufsock.html
It's under an open-source license and is owned by UCI. It's been tested on CPython 2.x, CPython 3.x, PyPy and Jython.
HTH
To be sure I'd have to see actual code, but it sounds like you are expecting a send of n bytes to show up on the receiver as exactly n bytes all the time, every time.
TCP streams don't work that way. TCP is a "streaming" protocol, as opposed to a "datagram" (record-oriented) one like UDP, SCTP, or RDS.
For fixed-data-size protocols (or any where the next chunk size is predictable in advance), you can build your own "datagram-like receiver" on a stream socket by simply recv()ing in a loop until you get exactly n bytes:
def recv_n_bytes(socket, n):
    "attempt to receive exactly n bytes; return what we got"
    data = []
    while True:
        have = sum(len(x) for x in data)
        if have >= n:
            break
        want = n - have
        got = socket.recv(want)
        if got == '':
            break
        data.append(got)  # keep the chunk we just read
    return ''.join(data)
(untested; python 2.x code; not necessarily efficient; etc).
You may not assume that data will become available for reading from the local socket in the same size pieces it was provided for sending at the other end. As you have seen, this is often true, but by no means reliably so. Rather, what TCP guarantees is that what goes in one end will eventually come out the other, in order, with nothing missing; if that cannot be achieved by means built into the protocol, such as retries, the whole thing breaks with an error.
Nagle is one possible cause, but not the only one.
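For the fixed-size struct.pack("hh", ...) messages in this question, the recv-in-a-loop idea above can be turned into a record-oriented reader; sock and the helper names are illustrative:

```python
import struct

MSG_FMT = "hh"                        # two native shorts
MSG_SIZE = struct.calcsize(MSG_FMT)   # 4 bytes

def recv_exact(sock, n):
    """Read exactly n bytes; raise if the peer closes mid-message."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        data.extend(chunk)
    return bytes(data)

def recv_mouse(sock):
    """Read one (mousex, mousey) message, regardless of how TCP chunked it."""
    return struct.unpack(MSG_FMT, recv_exact(sock, MSG_SIZE))
```

Because every message is exactly MSG_SIZE bytes, the receiver never cares whether TCP glued two sends together or split one in half.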
