Converting data from serial/usb using PySerial - python

I have a u-blox receiver connected to my computer and I am trying to read from it using PySerial. However, I am new to Python and was hoping to get some help understanding the data.
My code looks like:
import serial
# open the connection port
connection = serial.Serial('/dev/ttyACM0', 9600)
# open a file to print the data. I am doing this to make
# sure it is working
file1 = open('output_file', 'wb+')
# All messages from ublox receivers end with a carriage return
# and a newline
msg = connection.readline()
# print the message to the file
print >> file1, msg
This is what I get in the file. When I print the type of msg, it is a list:
['\xb5b\x01\x064\x00\xe0\x88\x96#\xd3\xb9\xff\xffX\x07\x03\xdd6\xc31\xf6\xfd)\x18\xea\xe6\x8fd\x1d\x00\x01\x00\x00\x00\x00\x00\x00\xfd\xff\xff\xff\x01\x00\x00\x00\x02\x00\x00\x00p\x00\x02\x0f\x16\xa2\x02\x00\x9c\xeb\xb5b\x01\x07\\x00\xe0\x88\x96#\xe0\x07\x01\x17\x15237\x04\x00\x00\x00\xd6\xb9\xff\xff\x03\x01\n']
["\x1a\x0c\x04\x19'y\x00$\xf7\xff\xff\x1a\x1d\x04\x01\x00\x007\x00\x00\x00\x00\x00\x02\x1f\x0c\x01\x00+:\x00\x00\x00\x00\x00\x01 \r\x07&-\x9f\x00\xff\x01\x00\x00\x17\xc1\x0c\x04\x16\n"]
The u-blox messages come in two formats. Some of the messages are in NMEA format (basically comma delimited):
$MSG, 1, 2, 3, 4
The other messages are raw binary, usually documented in hexadecimal, where each byte or set of bytes represents some piece of information:
[AA BB CC DD EE]
So my question is: is there a way I can interpret/convert the data from serial connection to a readable or more usable format so I can actually work with the messages. Like I said, I am new to python and more used to C++ style strings or array of characters

A typical parsing task. In this case, it'll probably be the simplest to make tokenization two-stage:
read the data until you run into a message boundary (you didn't give enough info on how to recognize it)
split the read message into its meaningful parts
for type 1, it's likely re.split(", *",text)
for type 2, none needed
display the parts however you want
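The two-stage approach above can be sketched as follows. This is a minimal sketch, not the answerer's code: it assumes the binary messages use the standard UBX framing (sync bytes 0xB5 0x62, then class, ID, a little-endian 16-bit payload length, the payload and a two-byte Fletcher checksum) and that NMEA sentences start with '$' and end with CR LF. The dump in the question begins with '\xb5b', which matches that sync pattern. The names split_messages and ubx_checksum are made up for illustration.

```python
import struct

UBX_SYNC = b'\xb5\x62'  # assumed UBX sync bytes (0xB5 0x62)


def ubx_checksum(body: bytes) -> bytes:
    """8-bit Fletcher checksum over class, id, length and payload."""
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes([ck_a, ck_b])


def split_messages(buffer: bytes):
    """Return (messages, leftover) for a buffer of mixed UBX/NMEA bytes."""
    messages = []
    pos = 0
    while pos < len(buffer):
        if buffer[pos] == 0xB5 and pos + 1 == len(buffer):
            break  # could be the first half of a sync pattern
        if buffer.startswith(UBX_SYNC, pos):
            if len(buffer) - pos < 6:
                break  # header not complete yet
            msg_class, msg_id, length = struct.unpack_from('<BBH', buffer, pos + 2)
            end = pos + 6 + length + 2
            if end > len(buffer):
                break  # payload or checksum not complete yet
            body = buffer[pos + 2:end - 2]
            if ubx_checksum(body) == buffer[end - 2:end]:
                messages.append(('UBX', msg_class, msg_id, body[4:]))
                pos = end
            else:
                pos += 1  # bad checksum: skip one byte and resynchronise
        elif buffer.startswith(b'$', pos):
            end = buffer.find(b'\r\n', pos)
            if end == -1:
                break  # sentence not complete yet
            text = buffer[pos:end].decode('ascii', errors='replace')
            messages.append(('NMEA', text.split(',')))
            pos = end + 2
        else:
            pos += 1  # skip bytes that start neither kind of message
    return messages, buffer[pos:]
```

With PySerial, one would keep appending the result of connection.read(...) to a bytes buffer, call split_messages on it, and carry the leftover bytes over to the next read. readline is a poor fit for the binary messages because a 0x0A byte can occur anywhere inside a payload.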
Regarding why serial.Serial.readline returns a list: I consulted the sources. serial.Serial delegates readline to io.IOBase, and its source indeed shows that it should return a bytestring, not a list.
So the function might be overridden by something in your code. For example, what do print connection.readline and print serial.Serial.readline show?

Related

Why these Python send / receive socket functions work if invoked slowly, but fail if invoked quickly in a row?

I have a client and a server, where the server needs to send a number of text files to the client.
The send file function receives the socket and the path of the file to send:
CHUNKSIZE = 1_000_000
def send_file(sock, filepath):
    with open(filepath, 'rb') as f:
        sock.sendall(f'{os.path.getsize(filepath)}'.encode() + b'\r\n')
        # Send the file in chunks so large files can be handled.
        while True:
            data = f.read(CHUNKSIZE)
            if not data:
                break
            sock.send(data)
And the receive file function receives the client socket and the path where to save the incoming file:
CHUNKSIZE = 1_000_000
def receive_file(sock, filepath):
    with sock.makefile('rb') as file_socket:
        length = int(file_socket.readline())
        # Read the data in chunks so it can handle large files.
        with open(filepath, 'wb') as f:
            while length:
                chunk = min(length, CHUNKSIZE)
                data = file_socket.read(chunk)
                if not data:
                    break
                f.write(data)
                length -= len(data)
        if length != 0:
            print('Invalid download.')
        else:
            print('Done.')
It works by sending the file size as the first line, then sending the text file line by line.
Both are invoked in loops in the client and the server, so that files are sent and saved one by one.
It works fine if I put a breakpoint and invoke these functions slowly. But if I let the program run uninterrupted, it fails when reading the size of the second file:
File "/home/stark/Work/test/networking.py", line 29, in receive_file
length = int(file_socket.readline())
ValueError: invalid literal for int() with base 10: b'00,1851,-34,-58,782,-11.91,13.87,-99.55,1730,-16,-32,545,-12.12,19.70,-99.55,1564,-8,-10,177,-12.53,24.90,-99.55,1564,-8,-5,88,-12.53,25.99,-99.55,1564,-8,-3,43,-12.53,26.54,-99.55,0,60,0\r\n'
Clearly a lot more data is being received by that length = int(file_socket.readline()) line.
My questions: why is that? Shouldn't that line read only the size given that it's always sent with a trailing \n?
How can I fix this so that multiple files can be sent in a row?
Thanks!
It seems you are reusing the same connection for every file, and because file_socket is buffered, it has actually received more from the socket than your read loop consumed.
In other words, the buffered reader pulls extra data off the socket, and that data is discarded when the with block closes file_socket. The next readline() then starts somewhere in the middle of the stream and returns whatever follows the discarded bytes, up to the next newline, rather than the next length line.
So the underlying problem is that part of the stream has been skipped. The next line read is not the int you expected, hence the failure you observed.
You can write:
with sock.makefile('rb', buffering=0) as file_socket:
instead, to make the file-like access unbuffered. Alternatively, handle the receiving, buffering and parsing of the incoming bytes yourself (working out where one file ends and the next begins) instead of relying on the file-like wrapper and readline.
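Another way to avoid losing buffered bytes, sketched here as an alternative rather than as what the answer proposes, is to create the buffered reader once per connection and pass that same reader to every receive_file call. Anything the reader has read ahead then stays in its buffer for the next file. The changed signature (a reader instead of a socket) and the boolean return value are choices made for this sketch; sendall replaces send so that partial sends cannot drop data.

```python
import os

CHUNKSIZE = 1_000_000


def send_file(sock, filepath):
    """Send the file size as a text line, then the raw file contents."""
    with open(filepath, 'rb') as f:
        sock.sendall(f'{os.path.getsize(filepath)}'.encode() + b'\r\n')
        while True:
            data = f.read(CHUNKSIZE)
            if not data:
                break
            sock.sendall(data)  # sendall retries until every byte is sent


def receive_file(reader, filepath):
    """Read one file from a reader shared by all files on the connection.

    Returns True if the announced number of bytes was received.
    """
    length = int(reader.readline())
    with open(filepath, 'wb') as f:
        while length:
            data = reader.read(min(length, CHUNKSIZE))
            if not data:
                break  # connection closed early
            f.write(data)
            length -= len(data)
    return length == 0
```

On the receiving side this would be used as reader = sock.makefile('rb') once after accepting the connection, followed by one receive_file(reader, path) call per expected file.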
You have to understand that socket communication is based on TCP/IP, whether both ends are on the same machine (in which case the loopback interface is used) or on different machines. So there are IP addresses between which the connection is established. Going further, it involves accessing your network adapter, which takes relatively long compared with accessing, e.g., RAM. Additionally, the adapter itself decides when to send particular data frames (lower ISO/OSI layers). In TCP an ACK is required, but a standard PC usually does not have industrial, real-time Ethernet.
So, in your code, you have a while True loop without any sleep, and you don't check what sock.send returns. Even if something goes wrong with a particular chunk, you ignore it and try to send the next one. At first glance it appears that something was cached and the receiver got what was flushed once the connection was re-established.
So the first thing you should do is check the value returned by sock.send: it is the number of bytes actually sent, which can be smaller than len(data). If it is, the unsent remainder has to be sent again (sock.sendall does this for you). Another thing I strongly recommend in such cases is to design a small custom protocol (usually called the application layer in the context of the OSI/ISO stack). For example, you might have 4 types of frames: START, FILESIZE, DATA and END. Assign each a unique ID and start each frame with that identifier. START is empty, FILESIZE contains a single uint16, DATA contains {FILE NUMBER, LINE NUMBER, LINE_LENGTH, LINE} and END is empty. Then, once you have the entire frame on the client, you can safely assemble the information you received.
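A framed protocol along those lines might look like the sketch below. The header layout (a one-byte frame type followed by a four-byte big-endian payload length) and the helper names are assumptions made for illustration, not something the answer specifies. The functions work on any binary file-like object, for example one obtained from sock.makefile('rwb').

```python
import struct

# Hypothetical frame type identifiers, following the names in the answer.
START, FILESIZE, DATA, END = range(4)

# Assumed header: frame type (1 byte) + payload length (4 bytes, big-endian).
HEADER = struct.Struct('!BI')


def write_frame(stream, frame_type, payload=b''):
    """Write one frame: header followed by the payload bytes."""
    stream.write(HEADER.pack(frame_type, len(payload)) + payload)


def read_exact(stream, size):
    """Read exactly size bytes, or raise EOFError if the stream ends first."""
    chunks = []
    while size:
        chunk = stream.read(size)
        if not chunk:
            raise EOFError('stream closed in the middle of a frame')
        chunks.append(chunk)
        size -= len(chunk)
    return b''.join(chunks)


def read_frame(stream):
    """Read one frame and return (frame_type, payload)."""
    frame_type, length = HEADER.unpack(read_exact(stream, HEADER.size))
    return frame_type, read_exact(stream, length)
```

Because every frame announces its own length, the receiver never has to guess where one message ends and the next begins, no matter how the bytes are split across recv calls.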

PwnTools recv() on output that expects input directly after

Hi, I have a problem that I cannot seem to find a solution for.
(Maybe I'm just horrible at phrasing searches correctly in English.)
I'm trying to run a binary from Python using pwntools and read its output completely before sending any input myself.
The output from my binary is as follows:
Testmessage1
Testmessage2
Enter input: <binary expects me to input stuff here>
Where I would like to read the first line, the second line and the output part of the third line (with ':' being the last character).
The third line of the output does not contain a newline at the end and expects the user to make an input directly. However, I'm not able to read the output contents that the third line starts with, no matter what I try.
My current way of trying to achieve this:
from pwn import *
io = process("./testbin")
print io.recvline()
print io.recvline()
print io.recvuntil(":", timeout=1) # this gets stuck if I don't use a timeout
...
# maybe sending data here
# io.send(....)
io.close()
Do I misunderstand something about stdin and stdout? Is the "Enter input:" on the third line not part of the output that I should be able to receive before sending any input?
Thanks in advance
I finally figured it out.
I got the hint I needed from
https://github.com/zachriggle/pwntools-glibc-buffering/blob/master/demo.py
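It seems that a lot of buffering is going on: glibc's stdio fully buffers stdout when it is not attached to a terminal, so the prompt is apparently held back in the program's own buffer.
After manually making sure that pwntools uses a pseudoterminal for stdin and stdout, it works!
from pwn import *
pty = process.PTY
p = process("./testbin", stdin=pty, stdout=pty)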
You can use the clean function which is more reliable and which can be used for remote connections: https://docs.pwntools.com/en/dev/tubes.html#pwnlib.tubes.tube.tube.clean
For example:
def start():
    p = remote("0.0.0.0", 4000)
    return p
io = start()
io.send(b"YYYY")
io.clean()
io.send(b"ZZZ")

Reading python serial data without new line

I am reading serial port data in Python. The data contains no newlines, so it arrives as one continuous stream. Each data packet is terminated with '|'. I want to read the data continuously and print each packet on a new line after its '|'.
My current data looks like this (the highlighted fields are the data of interest in each packet; how can I extract and plot them?):
b'\x01\x063\x011;790.10,3.73,203.45;0.00;28|1;503.36,2.88,129.87;2.00;28|1;1.00,1.60,0.23;4.00;28|1;167.10,1.13,44.98;6.00;28|1;0.07,0.34,2.02;8.00;28|1;100.44,1.04,26.24;10.00;28|1;0.33,0.89,1.65;12.00;28|1;71.72,0.13,19.10;14.00;28|1;0.07,0.41,1.76;16.00;28|1;55.70,0.08,14.89;18.00;28|1;0.19,0.61,2.07;20.00;28|1;45.84,0.46,11.70;22.00;28|1;0.07,0.44,1.76;24.00;28|1;38.87,0.53,9.90;26.00;28|1;0.12,0.11,1.62;28.00;28|1;33.65,0.26,8.57;30.00;28|1;0.07,0.11,1.58;32.00;28|1;29.80,0.36,7.51;34.00;28|1;0.09,0.37,1.48;36.00;28|1;26.65,0.17,6.80;38.00;28|1;0.07,0.28,1.43;40.00;28|1;24.07,0.11,6.32;42.00;28|1;0.06,0.14,1.66;44.00;28|1;22.11,0.09,5.65;46.00;28|1;0.07,0.15,1.66;48.00;28|1;20.41,0.22,5.13;50.00;28|1;0.05,0.08,1.61;52.00;28|1;18.93,0.05,4.80;54.00;28|1;0.06,0.12,1.77;56.00;28|1;17.74,0.14,4.24;58.00;28|1;0.03,0.04,1.57;60.00;28|1;16.61,0.06,4.03;62.00;28|1;0.06,0.07,1.55;64.00;28|1;15.59,0.14,3.86;66.00;28|1;0.02,0.11,1.68;68.00;28|1;14.78,0.12,3.49;70.00;28|1;0.06,0.18,1.57;72.00;28|1;14.03,0.05,3.39;74.00;28|1;0.01,0.09,1.67;76.00;28|1;13.35,0.04,3.15;78.00;28|1;0.05,0.14,1.72;80.00;28|1;12.81,0.18,2.85;82.00;28|1;0.00,0.10,1.60;84.00;28|1;12.26,0.16,2.75;86.00;28|1;0.05,0.07,1.61;88.00;28|1;11.73,0.08,2.58;90.00;28|1;0.01,0.08,1.58;92.00;28|1;11.31,0.10,2.46;94.00;28|1;0.04,0.16,1.54;96.00;28|1;10.87,0.07,2.40;98.00;28|1;0.01,0.08,1.57;100.00;28|1;10.48,0.06,2.32;102.00;28|1;0.04,0.06,1.66;104.00;28|1;10.19,0.06,2.15;106.00;28|1;0.02,0.03,1.64;108.00;28|1;9.87,0.09,2.05;110.00;28|1;0.03,0.07,1.65;112.00;28|1;9.57,0.02,1.94;114.00;28|1;0.02,0.04,1.69;116.00;28|1;9.33,0.09,1.75;118.00;28|1;0.03,0.04,1.57;120.00;28|1;9.05,0.03,1.75;122.00;28|1;0.02,0.02,1.61;124.00;28|1;8.79,0.08,1.68;126.00;28|1;0.02,0.10,1.65;128.00;28|1;8.61,0.05,1.55;130.00;28|1;0.03,0.11,1.58;132.00;28|1;8.40,0.06,1.56;134.00;28|1;0.02,0.04,1.67;136.00;28|1;8.22,0.05,1.42;138.00;28|1;0.03,0.07,1.65;140.00;28|1;8.08,0.13,1.31;142.00;28|1;0.01,0.07,1.59;144.00;28|1;7.90,0.08,1.30;146.00;28|1
;0.03,0.06,1.62;148.00;28|1;7.73,0.03,1.20;150.00;28|1;0.01,0.07,1.58;152.00;28|1;7.60,0.04,1.17;154.00;28|1;0.03,0.10,1.57;156.00;28|1;7.45,0.07,1.16;158.00;28|1;0.01,0.02,1.62;160.00;28|1;7.33,0.04,1.10;162.00;28|1;0.03,0.04,1.65;164.00;28|1;7.26,0.05,1.01;166.00;28|1;0.00,0.01,1.64;168.00;28|1;7.14,0.04,0.96;170.00;28|1;0.03,0.05,1.66;172.00;28|1;7.03,0.05,0.88;174.00;28|1;0.00,0.02,1.65;176.00;28|1;6.95,0.07,0.78;178.00;28|1;0.03,0.05,1.57;180.00;28|1;6.84,0.02,0.81;182.00;28|1;0.01,0.03,1.62;184.00;28|1;6.75,0.04,0.74;186.00;28|1;0.03,0.09,1.62;188.00;28|1;6.71,0.06,0.68;190.00;28|1;0.01,0.07,1.59;192.00;28|1;6.64,0.06,0.70;194.00;28|1;0.02,0.02,1.67;196.00;28|1;6.58,0.06,0.57;198.00;28|1;0.01,0.04,1.62;200.00;28|1;6.54,0.10,0.53;202.00;28|1;0.02,0.08,1.59;204.00;28|1;6.46,0.04,0.52;206.00;28|1;0.02,0.06,1.62;208.00;28|1;6.40,0.01,0.45;210.00;28|1;0.02,0.06,1.58;212.00;28|1;6.37,0.03,0.45;214.00;28|1;0.02,0.06,1.60;216.00;28|1;6.32,0.07,0.43;218.00;28|1;0.02,0.01,1.64;220.00;28|1;6.29,0.02,0.36;222.00;28|1;0.02,0.03,1.65;224.00;28|1;6.29,0.05,0.31;226.00;28|1;0.01,0.02,1.64;228.00;28|1;6.24,0.01,0.26;230.00;28|1;0.02,0.04,1.66;232.00;28|1;6.21,0.06,0.18;234.00;28|1;0.01,0.03,1.62;236.00;28|1;6.20,0.05,0.14;238.00;28|1;0.02,0.06,1.57;240.00;28|1;6.16,0.01,0.16;242.00;28|1;0.01,0.04,1.62;244.00;28|1;6.16,0.02,0.09;246.00;28|1;0.02,0.09,1.60;248.00;28|1;6.18,0.07,0.09;250.00;28|1;0.00,0.04,1.60;252.00;28|1;6.17,0.06,0.08;254.00;28|DATAEND|
I am currently reading 3480 bytes at a time, but I want to read the data continuously.
import serial
ser = serial.Serial('/dev/ttyUSB0', 115200, serial.EIGHTBITS, serial.PARITY_NONE, serial.STOPBITS_ONE)
buff = list()
values = bytearray([1,6,51,1]) # serial port command to read serial data
#print(type(values))
ser.write(values)
while True:
    s = ser.read(3480)
    print(s)
I want to separate all fields and print the data that is highlighted
I would recommend read_until, which lets you define the character that acts as the line terminator (LF is the default value). The call signature is read_until(expected=LF, size=None), so here you would call ser.read_until(b'|'). For details, consult https://pyserial.readthedocs.io/en/latest/pyserial_api.html . Alternatively, you could read everything that is available and use a regular expression to find the individual packets and extract the fields. This can be done with re.finditer.
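Splitting the stream into packets and fields could look like the sketch below. The field layout is an assumption read off the sample dump in the question (four ';'-separated fields, the second holding comma-separated numbers and the third a single number); the function names and the returned tuple are made up for illustration.

```python
def extract_packets(buffer: bytes, terminator: bytes = b'|'):
    """Split a buffer into complete packets plus the unfinished remainder."""
    *complete, remainder = buffer.split(terminator)
    return complete, remainder


def parse_packet(packet: bytes):
    """Parse one packet into (values, position), or None if it does not fit.

    Assumed layout: '<flag>;<v1>,<v2>,<v3>;<position>;<code>'.
    """
    parts = packet.decode('ascii', errors='replace').split(';')
    if len(parts) != 4:
        return None  # e.g. the final 'DATAEND' marker
    try:
        values = [float(v) for v in parts[1].split(',')]
        position = float(parts[2])
    except ValueError:
        return None
    return values, position
```

In a read loop, each chunk returned by ser.read(...) would be appended to the remainder from the previous call before calling extract_packets again, so a packet cut in half by a read boundary is completed on the next pass.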

Python Socket is receiving inconsistent messages from Server

So I am very new to networking and I was using the Python Socket library to connect to a server that is transmitting a stream of location data.
Here is the code used.
import socket
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))
try:
    while (1):
        data = s.recv(BUFFER_SIZE).decode('utf-8')
        print(data)
except KeyboardInterrupt:
    s.close()
The issue is that the data arrives in inconsistent forms.
Most of the times it arrives in the correct form like this:
2016-01-21 22:40:07,441,-84.404153,33.778685,5,3
Yet other times it can arrive split up into two lines like so:
2016-01-21
22:40:07,404,-84.396004,33.778085,0,0
The interesting thing is that when I establish a raw connection to the server using Putty I only get the correct form and never the split. So I imagine that there must be something happening that is splitting the message. Or something Putty is doing to always assemble it correctly.
What I need is for the variable data to contain the proper line always. Any idea how to accomplish this?
It is best to think of a socket as a continuous stream of data, that may arrive in dribs and drabs, or a flood.
In particular, it is the receiver's job to break the data up into the "records" that it should consist of; the socket does not magically know how to do this for you. Here the records are lines, so you must read the data and split it into lines yourself.
You cannot guarantee that a single recv will be a single full line. It could be:
just part of a line;
or several lines;
or, most probably, several lines and another part line.
Try something like: (untested)
# we'll use this to collate partial data
data = ""
while 1:
    # receive the next batch of data
    chunk = s.recv(BUFFER_SIZE).decode('utf-8')
    if not chunk:
        # the server closed the connection
        break
    data += chunk
    # split the data into lines
    lines = data.splitlines(keepends=True)
    # the last of these may be a part line
    full_lines, last_line = lines[:-1], lines[-1]
    # print (or do something else!) with the full lines
    for l in full_lines:
        print(l, end="")
    # was the last line received a full line, or just half a line?
    if last_line.endswith("\n"):
        # print it (or do something else!)
        print(last_line, end="")
        # and reset our partial data to nothing
        data = ""
    else:
        # reset our partial data to this part line
        data = last_line
The easiest way to fix your code is to print the received data without adding a new line, which the print statement (Python 2) and the print() function (Python 3) do by default. Like this:
Python 2:
print data,
Python 3:
print(data, end='')
Now print will not add its own newline character to the end of each printed value, and only the newlines present in the received data will be printed. The result is that each line is printed without being split according to the amount of data returned by each socket.recv() call. For example:
from __future__ import print_function
import socket
s = socket.socket()
s.connect(('gump.gatech.edu', 756))
while True:
    data = s.recv(3).decode('utf8')
    if not data:
        break # socket closed, all data read
    print(data, end='')
Here I have used a very small buffer size of 3 which helps to highlight the problem.
Note that this only fixes the problem from the POV of printing the data. If you wanted to process the data line-by-line then you would need to do your own buffering of the incoming data, and process the line when you receive a new line or the socket is closed.
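For line-by-line processing, the buffering described above can be wrapped in a generator. This is a sketch under stated assumptions: lines end with '\n' (an optional '\r' is stripped), and the name iter_lines is made up. It buffers raw bytes and decodes each complete line separately, so a multi-byte UTF-8 character split across two recv calls cannot cause a decode error.

```python
def iter_lines(sock, bufsize=1024, encoding='utf-8'):
    """Yield complete text lines from a connected socket, one at a time."""
    pending = b''
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break  # the peer closed the connection
        pending += chunk
        # Everything before the last newline is complete; keep the rest.
        *lines, pending = pending.split(b'\n')
        for line in lines:
            yield line.rstrip(b'\r').decode(encoding)
    if pending:
        # Data left over without a trailing newline when the socket closed.
        yield pending.decode(encoding)
```

It would be used as for line in iter_lines(s): ..., so that the loop body always sees exactly one full record regardless of how the bytes arrived.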
Edit:
socket.recv() is blocking and, as the others said, you won't get exactly one line each time you call it. The socket waits for data, takes whatever has arrived and then returns. When you print this, Python's default end argument adds a newline each time, so you may get more newlines than you expected. To get the raw stream from your server, use this:
import socket
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))
try:
    while (1):
        data=s.recv(BUFFER_SIZE).decode('utf-8')
        if not data: break
        print(data, end="")
except KeyboardInterrupt:
    s.close()

reading from sys.stdin without newline or EOF

I want to receive data from my GPS tracker. It sends data over TCP, so I use xinetd to listen on a TCP port and a Python script to handle the data. This is the xinetd config:
service gps-gprs
{
disable = no
flags = REUSE
socket_type = stream
protocol = tcp
port = 57003
user = root
wait = no
server = /path/to/gps.py
server_args = 3
}
Config in /etc/services
gps-gprs 57003/tcp # Tracking system
And Python script gps.py
#!/usr/bin/python
import sys
def main():
    data = sys.stdin.readline().strip()
    #do something with data
    print 'ok'
if __name__ =='__main__':
    main()
The tracker sends data strings in raw text like
$GPRMC,132017.000,A,8251.5039,N,01040.0065,E,0.00,,010111,0,,A*75+79161234567#
The problem is that sys.stdin in the Python script never receives an end-of-line or end-of-file character, so sys.stdin.readline() waits forever. I tried sending data from another PC with this Python script:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('', 57003))
s.sendall( u'hello' )
data = s.recv(4024)
s.close()
print 'Received', data
If the message is 'hello', it fails, but if the message is 'hello\n', everything works. However, I don't know how to tell the tracker or xinetd to add this '\n' at the end of messages. How can I read the data from sys.stdin when it contains no EOF or EOL?
Simple:
data=sys.stdin.read().splitlines()
for i in data:
    print i
No newlines
sys.stdin.readline() waits until it receives a newline. Then it considers the current line to be complete and returns it in full. If you want to read data that doesn't contain newlines, or you don't want to wait for a newline before you process (some of) the data, then you have to use something other than readline. Most likely you should call read, which reads arbitrary data up to a given size.
However, your GPS appears to be sending data in the well-known NMEA format, and that format terminates each line with a newline. Actually, it probably terminates each line with CRLF (\r\n), but it is possible that the \r is getting munged somewhere before it reaches your TCP socket. Either way there's a \n at the very end of each line.
If your readline call hangs without returning any lines, most likely it's because the sender is buffering lines until it has a full buffer. If you waited long enough for the sender's buffer to fill up, you'd get a whole bunch of lines at once. If that's what's happening, you'll have to change the sender so that it flushes its send buffer after each NMEA sentence.
It seems you are receiving # instead of <CR><LF>, just read until the # sign.
data = ""
while not data.endswith('#'):
    char = sys.stdin.read(1)
    if not char:
        break  # EOF before the '#' terminator arrived
    data += char
#do something with data
print 'ok'
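The same idea in Python 3, reading bytes rather than text, might look like this. It is only a sketch: the helper name read_until is made up, and it assumes the tracker really does end every message with '#', as the sample string in the question suggests.

```python
def read_until(stream, terminator=b'#'):
    """Read single bytes from a binary stream until terminator or EOF.

    The terminator, if it was seen, is included in the returned bytes.
    """
    data = bytearray()
    while not data.endswith(terminator):
        byte = stream.read(1)
        if not byte:
            break  # EOF: return whatever arrived so far
        data += byte
    return bytes(data)
```

Under xinetd the script would call read_until(sys.stdin.buffer), since sys.stdin.buffer is the binary stream behind sys.stdin in Python 3. Reading one byte at a time is slow for large volumes but is adequate for short tracker messages.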
My solution :
var = sys.stdin.readline().replace('\n', '')
It:
finds the newline in the input,
replaces it with '' (an empty string), i.e. removes it,
assigns the result to the variable.
