PySerial skips/loses data on serial acquisition - python

I have a data acquisition system that produces ASCII data. The data is acquired over USB using a serial communication protocol (a virtual serial port, according to the manufacturer of the box). I have a Python program that uses PySerial with a PySide GUI; it plots the acquired data and saves it to HDF5 files. I'm having a very weird problem and I don't know how to tackle it. I'd appreciate any advice on how you would debug this.
How the problem shows up: if I use software like Eltima Data Logger, the acquired data looks fine. However, with my software (using PySerial), some chunks of the data are missing. What's weird is that the missing chunks are inconsistent with the way I read: I read line by line, yet what goes missing are 100-byte or 64-byte chunks that sometimes include newlines! I know what's missing because the device buffers the data on an SD card before sending it to the computer. This made me believe for a long time that the hardware had a problem, until Eltima showed that the data is actually arriving intact.
The Eltima configuration matches mine (its settings screenshot is omitted here). This whole thing is running in a QThread. These are the port settings I use in my code (with some minor polishing to make them reusable here):
self.obj = serial.Serial()
self.obj.port = instrumentName
self.obj.baudrate = 115200
self.obj.bytesize = serial.EIGHTBITS
self.obj.parity = serial.PARITY_ODD
self.obj.stopbits = serial.STOPBITS_ONE
self.obj.timeout = 1
self.obj.xonxoff = False
self.obj.rtscts = False
self.obj.dsrdtr = False
self.obj.writeTimeout = 2
self.obj.open()
The reading algorithm: I have a loop that looks for a specific header line; once found, it keeps pushing lines into a buffer until a specific end line is found, and then that batch is processed. Here is my code:
try:
    # keep reading until a header line is found that indicates the beginning of a batch of data
    while not self.stopped:
        self.line = self.readLine()
        self.writeDumpFileLine(self.line)
        if self.line == DataBatch.d_startString:
            print("Acquiring batch, line by line...")
            self.dataStrQueue.append(self.line)
            break
    # after the header line, keep reading until a specific end string is found
    while not self.stopped:
        self.line = self.readLine()
        self.writeDumpFileLine(self.line)
        self.dataStrQueue.append(self.line)
        if self.line == DataBatch.d_endString:
            break
except Exception as e1:
    print("Exception while trying to read. Error: " + str(e1))
The self.writeDumpFileLine() call takes each line from the device and dumps it to a file directly, before any processing, for debugging purposes. These dump files are what confirmed the missing chunks.
The implementation of self.readLine() is quite simple:
def readLine(self):
    lineData = decodeString(self.obj.readline())
    lineData = lineData.replace(acquisitionEndlineChar, "")
    return lineData
I would like to point out that I also have an implementation that pulls thousands of lines and parses them based on inWaiting(), and this method has the same problem too!
Now I'm starting to wonder: Is it PySerial? What else could be causing this problem?
Thank you so much for any efforts. If you require any additional information, please ask!
UPDATE:
Actually, I have just confirmed that the problem can be reproduced by making the system lag a little. I use PyCharm to develop this software, and while the program is running, pressing Ctrl+S to save makes the PyCharm GUI (and hence its terminal) freeze briefly. Repeating this several times reproduces the problem reliably!
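Given that update, one likely culprit is the OS/driver serial input buffer overflowing while the reading process is stalled. A common mitigation is to keep one thread doing nothing but pulling bytes off the port into a queue, and to do all parsing, plotting, and HDF5 writing elsewhere. This is a sketch of that pattern, not your exact code: `FakePort` is a hypothetical stand-in for `serial.Serial` so the example runs without hardware.

```python
import io
import queue
import threading

class FakePort:
    """Hypothetical stand-in for serial.Serial, so the sketch runs without hardware."""
    def __init__(self, data):
        self._buf = io.BytesIO(data)
    def read(self, size):
        # A real port opened with a timeout returns b'' when no data arrives in time.
        return self._buf.read(size)

def drain(port, q, stop):
    # The reader thread only moves bytes off the port; it never parses,
    # plots, or touches the GUI, so it is much harder for it to fall behind.
    while not stop.is_set():
        chunk = port.read(4096)
        if chunk:
            q.put(chunk)
        else:
            stop.set()  # demo only; a real loop would simply retry after the timeout

q = queue.Queue()
stop = threading.Event()
t = threading.Thread(target=drain, args=(FakePort(b"HEADER\n1,2,3\nEND\n"), q, stop))
t.start()
t.join()

received = b""
while not q.empty():
    received += q.get()
print(received)  # b'HEADER\n1,2,3\nEND\n'
```

The consumer side can then reassemble lines from the queued chunks at its own pace without risking overflow at the port.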

Related

How to do non blocking usb serial input in circuit python?

I am trying to interface a MicroPython board with Python on my computer using serial read and write; however, I can't find a way to read USB serial data in MicroPython that is non-blocking. Basically, I want to call the input function without requiring an input to move on (something like https://github.com/adafruit/circuitpython/pull/1186 but for USB).
I have tried using tasko, uselect (it failed to import the library and I can't find a download), and await functions. I'm not sure if there is a way to do this, but any help would be appreciated.
For CircuitPython there's supervisor.runtime.serial_bytes_available which may provide the building block for what you want to do.
This is discussed on Adafruit Forums: Receive commands from the computer via USB.
Based on the relatively new usb_cdc built-in (CircuitPython >= 7.0.0), you can do something like this:
def read_serial(serial):
    text = ""  # accumulate; also avoids an undefined name when nothing is waiting
    available = serial.in_waiting
    while available:
        raw = serial.read(available)
        text += raw.decode("utf-8")
        available = serial.in_waiting
    return text

# main
buffer = ""
serial = usb_cdc.console
while True:
    buffer += read_serial(serial)
    if buffer.endswith("\n"):
        # strip line end
        input_line = buffer[:-1]
        # clear buffer
        buffer = ""
        # handle input
        handle_your_input(input_line)
    do_your_other_stuff()
This is a very simple example. The line-end handling can get complex if you want to support universal line endings or multiple commands sent at once.
I have created a library based on this: CircuitPython_nonblocking_serialinput
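For the multiple-commands-at-once case, one simple approach (my sketch, not part of the library above) is to split off every complete line and keep only the trailing partial input in the buffer:

```python
buffer = "cmd1\ncmd2\npartial"  # hypothetical contents after one read

# Everything before the last "\n" is complete; the tail stays buffered.
*complete_lines, buffer = buffer.split("\n")

print(complete_lines)  # ['cmd1', 'cmd2']
print(buffer)          # 'partial'
```

If the buffer happens to end with `"\n"`, the tail is simply the empty string, so the same two lines handle both cases.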

Windows media foundation decoding audio streams

I am new to Windows Media Foundation and am currently following some tutorials on the basics of getting started and decoding audio. However, I am running into a few issues. I'm using Windows 10 64-bit (1809) and Python (ctypes and COM interfaces).
1) The IMFSourceReader will not let me select or deselect any stream. I have tried wav and mp3 formats (and multiple different files), but they all error out. According to the docs, to speed up performance you should deselect the streams you don't need and select only the one you want, in this case audio.
However:
source_reader.SetStreamSelection(MF_SOURCE_READER_ANY_STREAM, False)
Produces an error:
OSError: [WinError -1072875853] The stream number provided was invalid.
That shouldn't happen, since the MF_SOURCE_READER_ANY_STREAM value (a DWORD of 4294967294) should apply to any stream. Or am I incorrect in that?
I've tried seeing if I could just select the audio stream:
source_reader.SetStreamSelection(MF_SOURCE_READER_FIRST_AUDIO_STREAM, True)
Which produces a different error:
OSError: exception: access violation reading 0x0000000000000001
My current code up until that point:
MFStartup(MF_VERSION)  # initialize
source_reader = IMFSourceReader()
filename = "C:\\test.mp3"
MFCreateSourceReaderFromURL(filename, None, ctypes.byref(source_reader))  # out: source reader
if source_reader:  # not null
    source_reader.SetStreamSelection(MF_SOURCE_READER_ANY_STREAM, False)  # invalid stream?
    source_reader.SetStreamSelection(MF_SOURCE_READER_FIRST_AUDIO_STREAM, True)  # access violation??
IMFSourceReader seems to be functioning just fine for other functions, such as GetCurrentMediaType, SetCurrentMediaType, etc. Could MFCreateSourceReaderFromURL still return an IMFSourceReader even if something went wrong?
2) I am not sure whether not being able to select the streams is causing further issues (I suspect it is). If I just skip selecting or deselecting streams, everything actually works until I try to convert a sample into a single buffer with ConvertToContiguousBuffer, which, according to the docs, outputs an IMFMediaBuffer. The problem is that the call returns S_OK, but the buffer is null. I used GetBufferCount to make sure there are at least some buffers in the sample, and it always returns 1-3 depending on the file used, so the sample shouldn't be empty.
Here is the relevant code:
while True:
    flags = DWORD()
    sample = IMFSample()
    source_reader.ReadSample(streamIndex, 0, None, ctypes.byref(flags), None, ctypes.byref(sample))  # flags, sample [out]
    if flags.value & MF_SOURCE_READERF_ENDOFSTREAM:
        print("READ ALL OF STREAM")
        break
    if sample:
        buffer_count = DWORD()
        sample.GetBufferCount(ctypes.byref(buffer_count))
        print("BUFFER COUNT IN SAMPLE", buffer_count.value)
    else:
        print("NO SAMPLE")
        continue
    buffer = IMFMediaBuffer()
    hr = sample.ConvertToContiguousBuffer(ctypes.byref(buffer))
    print("Conversion succeeded", hr == 0)  # true
    if buffer:
        print("CREATED BUFFER")
    else:
        print("BUFFER IS NULL")
        break
I am unsure where to go from here; I couldn't find much explanation online regarding these specific issues. Is WMF still the go-to for Windows 10? Should I be using something else? I'm really stumped, and any help is greatly appreciated.
Try using MF_SOURCE_READER_ALL_STREAMS instead of MF_SOURCE_READER_ANY_STREAM.
source_reader.SetStreamSelection(MF_SOURCE_READER_ALL_STREAMS, False)
When reading the samples you also need to specify a valid stream index, which in your case I suspect 0 isn't. Try:
source_reader.ReadSample(MF_SOURCE_READER_FIRST_AUDIO_STREAM, 0, None, ctypes.byref(flags), None, ctypes.byref(sample))
Also does your Python wrapper return a result when you make your Media Foundation calls? Nearly all Media Foundation methods return a HRESULT and it's important to check it equals S_OK before proceeding. If you don't it's very hard to work out where a wrong call occurred.
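To make that HRESULT checking systematic, a small helper along these lines can wrap every call. This is a hypothetical helper, not part of Media Foundation: it relies only on the fact that an HRESULT is a signed 32-bit value where negative means failure and S_OK is 0.

```python
def check_hr(hr, context=""):
    """Raise if a signed-int HRESULT indicates failure; pass through on success."""
    if hr < 0:
        # Show the familiar unsigned hex form, e.g. 0xC00D36B3.
        raise OSError(f"{context} failed with HRESULT 0x{hr & 0xFFFFFFFF:08X}")
    return hr

check_hr(0, "MFStartup")  # S_OK, passes through
try:
    check_hr(-1072875853, "SetStreamSelection")  # the error value from the question
except OSError as e:
    print(e)  # SetStreamSelection failed with HRESULT 0xC00D36B3
```

Wrapping each Media Foundation call this way pinpoints exactly which call first fails, instead of letting a bad interface pointer surface later as an access violation.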
Is WMF still the goto for Windows 10?
Many people have asked that question. The answer depends on what you need to do (in your case audio codec support is very limited so perhaps it isn't the best option). But for things like rendering audio/video, reading/writing to media files, audio/video device capture etc. it is still the case that Media Foundation is the most up to date Microsoft supported option.
Usually, with Media Foundation, you need to call CoInitializeEx before MFStartup:
Media Foundation and COM
Best Practices for Applications
In Media Foundation, asynchronous processing and callbacks are handled by work queues. Work queues always have multithreaded apartment (MTA) threads, so an application will have a simpler implementation if it runs on an MTA thread as well. Therefore, it is recommended to call CoInitializeEx with the COINIT_MULTITHREADED flag.
MFCreateSourceReaderFromURL function
Remarks
Call CoInitialize(Ex) and MFStartup before calling this function.
In your code, I don't see the call to CoInitializeEx.
EDIT
Because you are using an audio file, normally you should have only one audio stream, and its index should be 0.
Try :
source_reader.SetStreamSelection(0, True)
and tell us the result.

Unexpected behaviour of read(n) in pySerial 3.3

I'm writing a project using an STM32F407_VG board that uses an RS232 connection to send batches of data of different sizes (~400 bytes) on a serial port; those data must be written to file. On the desktop side I'm using a Python 3 script with pyserial 3.3.
I've tried reading a single byte at a time with ser.read(), but I think it's too slow because I'm losing some of the data. So I'm trying to send the size of the batch as an integer before the batch itself, to reduce the overhead, and to write the data to file in the interval between one batch and the next.
PROBLEM IS: ser.read(n) behaves in a very strange way, and 99% of the time it blocks when it's time to read the batch and does not return. Sometimes it reads the first batch and writes it to file successfully, but then blocks at the second loop iteration. It's strange because I can use ser.read(4) to get the batch size with zero problems, and I use ser.readline() at the beginning of the script when listening for a starting signal, but I cannot read the data.
I'm sure the data are there and well formed because I checked with a logic analyzer, and I've already tried enabling and disabling flow control and setting different baud rates on both the board and the script. I think it could be a configuration problem on the Python side, but I've run out of ideas.
PYTHON SCRIPT CODE -SNIPPET
ser = serial.Serial(str(sys.argv[1]),
                    int(sys.argv[2]),
                    stopbits=serial.STOPBITS_ONE,
                    parity=serial.PARITY_NONE,
                    bytesize=serial.EIGHTBITS,
                    timeout=None
                    )
outputFile = open(sys.argv[3], "wb")
# wait for begin string
beginSignal = "ready"
word = ""
while word != beginSignal:
    word = ser.readline().decode()
    word = word.split("\n")[0]  # was: sample.split(...), an undefined name
print("Started receiving...")
while True:
    # read size of next batch
    nextBatchSize = ser.read(4)
    nextBatchSize = int.from_bytes(nextBatchSize, byteorder='little', signed=True)
    # read the batch:
    # THIS IS THE ONE THAT CREATES PROBLEMS
    batch = ser.read(nextBatchSize)
    # write data to file
    outputFile.write(batch)
BOARD CODE - SNIPPET
// this function sends the size of the batch and the batch itself
void sendToSerial(unsigned char* mp3data, int size){
    // send actual size of the batch
    write(STDOUT_FILENO, &size, sizeof(int));
    // send the batch of data
    write(STDOUT_FILENO, mp3data, size);
}
Any idea? Thanks!
You might have already tried this, but print out nextBatchSize to confirm that it is what you expect, just in case the byte order is reversed. If this is wrong your Python code could be trying to read too many bytes, and would therefore block.
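A quick way to sanity-check that byte order is to unpack the same 4 bytes both ways (the raw value below is hypothetical, standing in for what the board sends for a 400-byte batch):

```python
import struct

raw = b'\x90\x01\x00\x00'  # hypothetical 4 size bytes for a 400-byte batch

little = struct.unpack('<i', raw)[0]  # little-endian signed int
big = struct.unpack('>i', raw)[0]     # big-endian signed int

print(little)  # 400
print(big)     # -1878982656, a nonsense size that would make read() block or fail
```

If the printed size is a huge or negative number, the byte order (or the framing) is wrong, and ser.read(nextBatchSize) will wait forever for bytes that never come.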
Also you can check ser.in_waiting to see how many bytes are available to be read from the input buffer before your read attempt:
print(ser.in_waiting)
batch = ser.read(nextBatchSize)
You should also check the return value of write() in your C code: it is the number of bytes actually written, or -1 on error. write() is not guaranteed to write all of the bytes in the given buffer, so some may remain unwritten, or there might have been an error.
Data loss suggests a flow control issue. You need to ensure that it is enabled at both ends. You've said that you tried it, but in your posted code it is not enabled. How are you opening and configuring the serial port at the board end?
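Separately, note that with timeout=None a short read cannot happen, but with a finite timeout ser.read(n) may return fewer than n bytes. A generic helper (illustrative, not from the question) that loops until exactly n bytes are collected works with any read(k)-style callable, so it can be tested without a serial port:

```python
import io

def read_exact(read, n):
    """Collect exactly n bytes from a read(k) callable that may return short reads."""
    chunks = []
    remaining = n
    while remaining:
        chunk = read(remaining)
        if not chunk:
            raise EOFError(f"stream ended with {remaining} bytes still missing")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# Demo with an in-memory stream standing in for the port:
src = io.BytesIO(b"abcdef")
print(read_exact(src.read, 4))  # b'abcd'
```

With such a helper, a short read shows up as a clear exception instead of silently truncated batch data.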

Python LZMA : Compressed data ended before the end-of-stream marker was reached

I am using Python's built-in lzma module to decode compressed chunks of data. Depending on the chunk, I get the following exception:
Compressed data ended before the end-of-stream marker was reached
The data is NOT corrupted: it can be decompressed correctly with other tools, so it must be a bug in the library. There are other people experiencing the same issue:
http://bugs.python.org/issue21872
https://github.com/peterjc/backports.lzma/issues/6
Downloading large file in python error: Compressed file ended before the end-of-stream marker was reached
Unfortunately, no one seems to have found a solution yet, at least not one that works on Python 3.5.
How can I solve this problem? Is there any work around?
I spent a lot of time trying to understand and solve this problem, so I thought it would be a good idea to share it. The problem seems to be caused by a chunk of data without the end-of-stream marker properly set. To decompress a buffer, I used to use lzma.decompress from the standard lzma library. However, this method expects each data buffer to contain an end-of-stream marker, otherwise it throws an LZMAError exception.
To work around this limitation, we can implement an alternative decompress function that uses an LZMADecompressor object to extract the data from a buffer. For example:
from lzma import LZMADecompressor, LZMAError, FORMAT_AUTO

def decompress_lzma(data):
    results = []
    while True:
        decomp = LZMADecompressor(FORMAT_AUTO, None, None)
        try:
            res = decomp.decompress(data)
        except LZMAError:
            if results:
                break  # leftover data is not a valid LZMA/XZ stream; ignore it
            else:
                raise  # error on the first iteration; bail out
        results.append(res)
        data = decomp.unused_data
        if not data:
            break
        if not decomp.eof:
            raise LZMAError("Compressed data ended before the end-of-stream marker was reached")
    return b"".join(results)
This function is similar to the one provided by the standard lzma module, with one key difference: the loop is exited once the entire buffer has been processed, before checking whether the end-of-stream marker was reached.
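The unused_data handling is what lets the loop walk through concatenated streams. Here is a runnable illustration of just that pattern on two back-to-back xz streams (the test data is generated here, not taken from the original buffers):

```python
import lzma
from lzma import LZMADecompressor, FORMAT_AUTO

# Two independent xz streams glued together.
blob = lzma.compress(b"first") + lzma.compress(b"second")

results = []
data = blob
while data:
    decomp = LZMADecompressor(FORMAT_AUTO, None, None)
    results.append(decomp.decompress(data))
    data = decomp.unused_data  # bytes belonging to the next stream, if any

print(b"".join(results))  # b'firstsecond'
```

Each LZMADecompressor stops at its own end-of-stream marker and hands the remaining bytes back via unused_data, which is exactly what the workaround above relies on.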
I hope this can be useful to other people.

Reading and Writing to a file simultaneously

I have a module written in SystemVerilog that dumps the contents of an SRAM into a file. I would like to read from this file and use the data in a separate program written in Python, in real time. I don't have much control over the writing from the Verilog code. Is it possible to somehow manage the two reads and writes? Currently, when it reads from the file, there is a (seemingly) random number inserted at the start of every line, and that throws off the parsing. I assume these prefixes only appear when reading and writing happen at the same time, because if I run both very slowly it works fine.
window = Tk()
canvas = Canvas(window, width=WIDTH, height=HEIGHT, bg="#000000")
canvas.pack()
img = PhotoImage(width=WIDTH, height=HEIGHT)
canvas.create_image((WIDTH/2, HEIGHT/2), image=img, state="normal")

def redraw():
    fp = open('test_data.txt', 'r')
    lines = fp.readlines()
    for i in range(len(lines)):
        pass  # do stuff
    fp.close()
    window.after(35, redraw)

window.after(35, redraw)
mainloop()
This is the reading side.
Any suggestions are appreciated.
Reading and writing a file from multiple processes is likely to be unpredictable.
If you are running on a Unix-like system, you could use mkfifo to make a file-like object which you can write to and read from simultaneously and the data will stay in the correct order.
On Windows you need a NamedPipe - which you could create from Python and then connect to by opening as a normal file in SystemVerilog (I believe!)
http://docs.activestate.com/activepython/2.4/pywin32/win32pipe.html
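On a Unix-like system, the mkfifo approach can be sketched like this. The producer thread below is a stand-in for the SystemVerilog $fwrite side, and the path name is made up; only the FIFO mechanics are the point:

```python
import os
import tempfile
import threading

fifo_path = os.path.join(tempfile.mkdtemp(), "sram_fifo")
os.mkfifo(fifo_path)

def producer():
    # Stand-in for the simulator writing SRAM lines into the FIFO.
    with open(fifo_path, "w") as f:
        f.write("deadbeef\n")
        f.write("cafef00d\n")

t = threading.Thread(target=producer)
t.start()

# Opening the read end blocks until the writer connects; read() then
# returns everything up to the point where the writer closes its end.
with open(fifo_path) as f:
    lines = f.read().splitlines()
t.join()
print(lines)  # ['deadbeef', 'cafef00d']
```

Because the kernel serializes the bytes through the pipe, the reader can never observe a half-written line the way it can with a plain file being rewritten concurrently.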
I would suggest using VPI to directly access the contents of the SRAM straight from the simulation. This also opens the possibility of dynamically adjusting your stimulus (e.g. sending data until a FIFO is full) rather than relying on files for input/output.
Since you're using Python you could look into Cocotb, an open-source Python cosimulation framework. Basically you can use the python 'dot' notation to traverse the design hierarchy and pull out values:
# Pull out the values from the simulation
for index in range(len(dut.path.through.hierarchy.ram)):
val = dut.path.through.hierarchy.ram[index].value.integer
# do stuff
I created a quick example on EDA Playground with a simplistic example: http://www.edaplayground.com/s/57/565
Disclaimer: I'm one of the Cocotb developers.
You can use a pipe. In this example a cmd (Windows) command writes into the pipe, then the program shows its output from the same pipe:
import subprocess

p = subprocess.Popen("netstat", shell=False, stdout=subprocess.PIPE)
while True:
    out = p.stdout.readline()
    if out == b'' and p.poll() is not None:  # stdout yields bytes, so compare with b''
        break
    if out != b'':
        print(out.decode('ascii', 'backslashreplace'))
    else:
        break