Python serial communication - python

I'm working on an Arduino project, and I am interfacing it with a Python script due to memory limitations. On the Python side I have a two-dimensional list of x, y coordinate values containing 26000 coordinate pairs. To clarify the data structure, pathlist[0][0] returns the X value of the first coordinate in my list. Performing different operations on this list in Python poses no problems. Where I am running into trouble, however, is sending these values to the Arduino over serial in a way that is useful.
Due to the nature of serial communication (at least I think this is the case) I must send each integer as a string, one digit at a time. So a number like 345 would be sent as 3 individual characters: 3, 4, then 5.
What I am struggling with is finding a way to rebuild those integers on the Arduino.
Whenever I send a value over, it's receiving the data and outputting it like so:
//Python is sending over the number '25'
2ÿÿ52
//Python is sending the number 431.
4ÿÿ321ÿÿÿ2
The Arduino code is:
String str;
int ds = 4;
void setup() {
Serial.begin(9600);
}
void loop(){
if (Serial.available()>0) {
for (int i=0; i<4; i=i+1) {
char d= Serial.read();
str.concat(d);
}
char t[str.length()+1];
str.toCharArray(t, (sizeof(t)));
int intdata = atoi(t);
Serial.print(intdata);
}
}
And the Python code looks like this:
import serial
s = serial.Serial(port='/dev/tty.usbmodemfd131', baudrate=9600)
s.write(str(25))
I'm almost certain that the problem isn't stemming from the output method (Serial.print), seeing as when I declare another int, it formats fine on output, so I am assuming the problem lies in how the intdata variable is constructed.
One thing of note that may help diagnose this problem is that if I change Serial.print(intdata) to Serial.print(intdata+5) my result is 2ÿÿ57, where I would expect 30 (25+5). This 7 is present regardless of the input. For instance I could write 271 to the serial and my result would look as follows:
//For input 271.
2ÿÿ771ÿÿÿ7
It appears to me that Arduino is chunking the values into pairs of two and appending the length to the end. I can't understand why that would happen though.
It also seems to me that the ÿ are being added in the for loop. Meaning that they are added because nothing is being sent at that current moment. But even fixing that by adding yet another if(Serial.available()>0) conditional, the result is still not treated like an integer.
Also, would using Pickle be appropriate here?
What am I doing wrong?

You should wait a bit for the serial data to arrive.
The Arduino code should be:
if (Serial.available()){
delay(100); // Wait for all data.
while (Serial.available()) {
char d = Serial.read();
str.concat(d);
}
}
Also you have to clear your string before re-using it.
[Edit]
I forgot to mention: ÿ is byte 255, which is what the -1 returned by Serial.read() becomes when it is stored in a char. It means Serial.read() had nothing to read.
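The arithmetic behind that remark can be checked on the Python side. This is only an illustration of the byte values involved, not Arduino code, and it assumes the terminal shows received bytes as Latin-1 characters:

```python
# Serial.read() returns the int -1 when the receive buffer is empty.
NO_DATA = -1

# Storing -1 in an 8-bit char keeps only the low byte: 0xFF (255).
as_byte = NO_DATA & 0xFF

# Byte 255 is the character U+00FF (y with diaeresis) in Latin-1,
# which is the stray character seen in the output above.
as_text = bytes([as_byte]).decode("latin-1")

print(as_byte, as_text == "\u00ff")
```

So every ÿ in the output marks one read that happened while the buffer was empty.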

I would change the communication so python sends newlines between numbers, so you're not as dependent on the timing:
s.write(str(25)+'\n')
and then on the receiving side:
void loop(){
while (Serial.available() > 0) {
char d = Serial.read();
if (d == '\n') {
char t[str.length()+1];
str.toCharArray(t, (sizeof(t)));
int intdata = atoi(t);
Serial.print(intdata);
str = String();
}
else {
str.concat(d);
}
}
}
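For the Python side, here is a rough sketch of the same newline framing applied to a list of coordinate pairs. The comma between x and y, and the names encode_pairs and decode_stream, are assumptions made for illustration. The decoder only imitates what the Arduino loop above does, so the framing can be checked without any hardware:

```python
def encode_pairs(pairs):
    """Encode (x, y) integer pairs as ASCII lines, one pair per line."""
    return b"".join(f"{x},{y}\n".encode("ascii") for x, y in pairs)


def decode_stream(chunks):
    """Rebuild pairs from byte chunks that arrive in arbitrary pieces.

    Mirrors the Arduino loop above: buffer characters until a newline,
    then convert the completed line to integers.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            x, y = line.split(b",")
            yield int(x), int(y)


pathlist = [[25, 431], [271, 3]]  # stand-in for the real 26000 pairs
payload = encode_pairs(pathlist)
# With pyserial on Python 3 this would be sent with: s.write(payload)

# Simulate the serial link delivering the bytes three at a time.
pieces = [payload[i:i + 3] for i in range(0, len(payload), 3)]
print(list(decode_stream(pieces)))
```

Because each number ends with an explicit delimiter, it no longer matters how the bytes are split up in transit.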

Related

Fastest way to store and retrieve a large stream of small unstructured messages

I am developing an IOT application that requires me to handle many small unstructured messages (meaning that their fields can change over time - some can appear and others can disappear). These messages typically have between 2 and 15 fields, whose values belong to basic data types (ints/longs, strings, booleans). These messages fit very well within the JSON data format (or msgpack).
It is critical that the messages get processed in their order of arrival (understand: they need to be processed by a single thread - there is no way to parallelize this part). I have my own logic for handling these messages in realtime (the throughput is relatively small, a few hundred thousand messages per second at most), but there is an increasing need for the engine to be able to simulate/replay previous periods by replaying a history of messages. Though it wasn't initially written for that purpose, my event processing engine (written in Go) could very well handle dozens (maybe in the low hundreds) of millions of messages per second if I was able to feed it with historical data at a sufficient speed.
This is exactly the problem. I have been storing many (hundreds of billions) of these messages over a long period of time (several years), for now in delimited msgpack format (https://github.com/msgpack/msgpack-python#streaming-unpacking). In this setting and others (see below), I was able to benchmark peak parsing speeds of ~2M messages/second (on a 2019 Macbook Pro, parsing only), which is far from saturating disk IO.
Even without talking about IO, doing the following:
import json
message = {
'meta1': "measurement",
'location': "NYC",
'time': "20200101",
'value1': 1.0,
'value2': 2.0,
'value3': 3.0,
'value4': 4.0
}
json_message = json.dumps(message)
%%timeit
json.loads(json_message)
gives me a parsing time of 3 microseconds/message, that is slightly above 300k messages/second. Comparing with ujson, rapidjson and orjson instead of the standard library's json module, I was able to get peak speeds of 1 microsecond/message (with ujson), that is about 1M messages/second.
Msgpack is slightly better:
import msgpack
message = {
'meta1': "measurement",
'location': "NYC",
'time': "20200101",
'value1': 1.0,
'value2': 2.0,
'value3': 3.0,
'value4': 4.0
}
msgpack_message = msgpack.packb(message)
%%timeit
msgpack.unpackb(msgpack_message)
Gives me a processing time of ~750ns/message (about 100ns/field), that is about 1.3M messages/second. I initially thought that C++ could be much faster. Here's an example using nlohmann/json, though this is not directly comparable with msgpack:
#include <iostream>
#include <string>
#include "json.hpp"
using json = nlohmann::json;
const std::string message = "{\"value\": \"hello\"}";
int main() {
    auto jsonMessage = json::parse(message);
    for(size_t i=0; i<1000000; ++i) {
        jsonMessage = json::parse(message);
    }
    std::cout << jsonMessage["value"] << std::endl; // To avoid having the compiler optimize the loop away.
}
Compiling with clang 11.0.3 (std=c++17, -O3), this runs in ~1.4s on the same Macbook, that is to say a parsing speed of ~700k messages/second with even smaller messages than the Python example. I know that nlohmann/json can be quite slow, and was able to get parsing speeds of about 2M messages/second using simdjson's DOM API.
This is still far too slow for my use case. I am open to all suggestions to improve message parsing speed with potential applications in Python, C++, Java (or whatever JVM language) or Go.
Notes:
I do not necessarily care about the size of the messages on disk (consider it a plus if the storage method you suggest is memory-efficient).
All I need is a key-value model for basic data types - I do not need nested dictionaries or lists.
Converting the existing data is not an issue at all. I am simply looking for something read-optimized.
I do not necessarily need to parse the entire thing into a struct or a custom object, only to access some of the fields when I need it (I typically need a small fraction of the fields of each message) - it is fine if this comes with a penalty, as long as the penalty does not destroy the whole application's throughput.
I am open to custom/slightly unsafe solutions.
Any format I choose needs to be naturally delimited, in the sense that the messages will be written serially to a file (I am currently using one file per day, which is sufficient for my use case). I've had issues in the past with improperly delimited messages (see writeDelimitedTo in the Java Protobuf API - lose a single byte and the entire file is ruined).
Things I have already explored:
JSON: experimented with rapidjson, simdjson, nlohmann/json, etc.
Flat files with delimited msgpack (see this API: https://github.com/msgpack/msgpack-python#streaming-unpacking): what I am currently using to store the messages.
Protocol Buffers: slightly faster, but does not really fit with the unstructured nature of the data.
Thanks!!
I assume that messages only contain few named attributes of basic types (defined at runtime) and that these basic types are for example strings, integers and floating-point numbers.
For the implementation to be fast, it is better to:
avoid text parsing (slow because sequential and full of conditionals);
avoid checking if messages are ill-formed (not needed here as they should all be well-formed);
avoid allocations as much as possible;
work on message chunks.
Thus, we first need to design a simple and fast binary message protocol:
A binary message contains the number of its attributes (encoded on 1 byte) followed by the list of attributes. Each attribute contains a string prefixed by its size (encoded on 1 byte) followed by the type of the attribute (the index of the type in the std::variant, encoded on 1 byte) as well as the attribute value (a size-prefixed string, a 64-bit integer or a 64-bit floating-point number).
Each encoded message is a stream of bytes that can fit in a large buffer (allocated once and reused for multiple incoming messages).
Here is code to decode a message from a raw binary buffer:
#include <unordered_map>
#include <variant>
#include <string_view>
#include <cstdint>
#include <cstring>
#include <climits>
// Define the possible types here
using AttrType = std::variant<std::string_view, int64_t, double>;
// Decode the `msgData` buffer and write the decoded message into `result`.
// Assume the message is not ill-formed!
// msgData must not be freed or modified while the resulting map is being used.
void decode(const char* msgData, std::unordered_map<std::string_view, AttrType>& result)
{
static_assert(CHAR_BIT == 8);
const size_t attrCount = static_cast<unsigned char>(msgData[0]); // char may be signed: read the byte as unsigned
size_t cur = 1;
result.clear();
for(size_t i=0 ; i<attrCount ; ++i)
{
const size_t keyLen = static_cast<unsigned char>(msgData[cur]); // read the length byte as unsigned
std::string_view key(msgData+cur+1, keyLen);
cur += 1 + keyLen;
const size_t attrType = static_cast<unsigned char>(msgData[cur]);
cur++;
// A switch could be better if there is more types
if(attrType == 0) // std::string_view
{
const size_t valueLen = static_cast<unsigned char>(msgData[cur]); // read the length byte as unsigned
std::string_view value(msgData+cur+1, valueLen);
cur += 1 + valueLen;
result[key] = std::move(AttrType(value));
}
else if(attrType == 1) // Native-endian 64-bit integer
{
int64_t value;
// Required to not break the strict aliasing rule
std::memcpy(&value, msgData+cur, sizeof(int64_t));
cur += sizeof(int64_t);
result[key] = std::move(AttrType(value));
}
else // IEEE-754 double
{
double value;
// Required to not break the strict aliasing rule
std::memcpy(&value, msgData+cur, sizeof(double));
cur += sizeof(double);
result[key] = std::move(AttrType(value));
}
}
}
You probably need to write the encoding function too (based on the same idea).
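As a starting point for that encoder, here is a minimal Python sketch of the layout described above. The function name encode and the mapping from Python types to type indices are assumptions that follow the order of the std::variant (0 = string, 1 = 64-bit integer, 2 = double). Lengths and counts must fit in one byte, and a Python bool would be written as an integer.

```python
import struct


def encode(message):
    """Encode a flat dict into the binary layout described above.

    Per attribute: key length (1 byte), key bytes, type index (1 byte),
    then the value (size-prefixed string, int64 or double).
    """
    out = bytearray([len(message)])  # attribute count on 1 byte
    for key, value in message.items():
        key_bytes = key.encode("utf-8")
        out.append(len(key_bytes))   # raises ValueError above 255
        out += key_bytes
        if isinstance(value, str):
            value_bytes = value.encode("utf-8")
            out.append(0)
            out.append(len(value_bytes))
            out += value_bytes
        elif isinstance(value, int):
            out.append(1)
            out += struct.pack("=q", value)  # native-endian int64
        elif isinstance(value, float):
            out.append(2)
            out += struct.pack("=d", value)  # native-endian IEEE-754 double
        else:
            raise TypeError(f"unsupported type for {key!r}: {type(value)}")
    return bytes(out)


encoded = encode({"value": "hello"})
print(encoded)
```

Native byte order is used to match the memcpy in the decoder, so files written this way are only portable between machines of the same endianness.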
Here is an example of usage (based on your json-related code):
const char* message = "\x01\x05value\x00\x05hello";
void bench()
{
std::unordered_map<std::string_view, AttrType> decodedMsg;
decodedMsg.reserve(16);
decode(message, decodedMsg);
for(size_t i=0; i<1000*1000; ++i)
{
decode(message, decodedMsg);
}
std::visit([](const auto& v) { std::cout << "Result: " << v << std::endl; }, decodedMsg["value"]); // requires <iostream>
}
On my machine (with an Intel i7-9700KF processor) and based on your benchmark, I get 2.7M message/s with the code using the nlohmann json library and 35.4M message/s with the new code.
Note that this code can be much faster. Indeed, most of the time is spent in hashing and allocations. You can mitigate the problem by using a faster map implementation (e.g. boost::container::flat_map or ska::bytell_hash_map) and/or by using a custom allocator. An alternative is to build your own carefully tuned hash-map implementation. Another alternative is to use a vector of key-value pairs and a linear search to perform lookups (this should be fast because your messages should not have a lot of attributes and because you said that you need a small fraction of the attributes per message).
However, the larger the messages, the slower the decoding. Thus, you may need to leverage parallelism to decode message chunks faster.
With all of that, it is possible to reach more than 100M messages/s.
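To illustrate the linear-search idea for the case where only one field is needed, here is a small Python sketch that walks the binary layout described above and stops at the first matching key, without building a map. The name find_field is hypothetical, and the sketch is meant to show the access pattern rather than to be fast in Python.

```python
import struct


def find_field(buf, wanted):
    """Return the value stored under key `wanted`, or None if absent.

    Walks the attributes in order and stops at the first match.
    Assumes a well-formed message.
    """
    count = buf[0]
    cur = 1
    for _ in range(count):
        key_len = buf[cur]
        key = buf[cur + 1:cur + 1 + key_len]
        cur += 1 + key_len
        attr_type = buf[cur]
        cur += 1
        if attr_type == 0:               # size-prefixed string
            size = buf[cur]
            start, end = cur + 1, cur + 1 + size
        else:                            # 8-byte integer or double
            start, end = cur, cur + 8
        if key == wanted:
            raw = buf[start:end]
            if attr_type == 0:
                return raw.decode("utf-8")
            fmt = "=q" if attr_type == 1 else "=d"
            return struct.unpack(fmt, raw)[0]
        cur = end                        # skip the value, go to the next key
    return None


message = b"\x01\x05value\x00\x05hello"
print(find_field(message, b"value"))
```

Values of attributes that are not requested are skipped without being decoded, which is where the saving comes from.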

receiving bytes in different sizes from c++ server to python client

I am trying to send a few images taken from a folder from C++ server to Python client.
I have managed to send/receive the sizes as integers, but now I have to send/receive the actual images.
Since the images have different sizes, I would like the client to split the bytes according the images sizes.
I am a bit lost, since I am using one parameter now e.g. recv(1024)
and I am receiving a lot more bytes than the ones I sent. So I am not really sure of what's happening.
Server
ifstream stream(nm, std::ios::in | std::ios::binary);
if(stream.is_open())
{
vector<char> imageDataVec((istreambuf_iterator<char>(stream)), istreambuf_iterator<char>());
cout << "Size=of=image=== " << imageDataVec.size() << " bytes";
long conv_num= htonl(imageDataVec.size());
//send(new_socket, &converted_number, sizeof(converted_number), 0);
//send(new_socket, &imageDataVec, imageDataVec.size() , 0);
//size_t sent{};
int nbytes=0;
while (1)
{
//send(new_socket, &conv_num, sizeof(conv_num), 0);
nbytes = send(new_socket, &imageDataVec, imageDataVec.size(), 0);
//continue;
if (nbytes <= 0) {
std::clog << "error: while sending image\n";
break;
}
else
{
//sent += nbytes;
cout<<nbytes<<"=====1=1=1=1========"<<"bytes"<<endl;}
break;
}
//fclose(fin);
}
else
{cout<<"can't open folder"<<endl;}
Client
while(1):
    pic_bytes=s.recv(8)
    pic_bytes_amount=int.from_bytes(pic_bytes, byteorder='big', signed=False)
    print("received bytes======{}".format(pic_bytes_amount))
    f=open('pic.jpeg','wb')
    f.write(pic_bytes)
    f.close()
1) It seems strange that you are writing the address of the vector object to the socket.
I think the send call should look like this:
send(new_socket, imageDataVec.data()...);
2) As I understand it, on the client side you are trying to read an 8-byte integer length, but I don't see where the server writes this data.
3) Use int64_t instead of long, as you cannot be sure about the size of long.
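On the client side, one way to split the byte stream into images is to read a fixed-size length header and then exactly that many bytes. The sketch below assumes the server sends an 8-byte big-endian unsigned length before each image, which the posted server does not do yet. The helper names recv_exact and recv_image are made up for illustration, and a local socket pair stands in for the C++ server.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("socket closed before all bytes arrived")
        data += chunk
    return bytes(data)


def recv_image(sock):
    """Read one image framed as an 8-byte big-endian length plus payload."""
    (size,) = struct.unpack(">Q", recv_exact(sock, 8))
    return recv_exact(sock, size)


# Local demonstration: a connected socket pair stands in for the server.
server, client = socket.socketpair()
fake_images = [b"\xff\xd8first-image", b"\xff\xd8second, longer image"]
for img in fake_images:
    server.sendall(struct.pack(">Q", len(img)) + img)
received = [recv_image(client) for _ in fake_images]
server.close()
client.close()
print([len(r) for r in received])
```

Note that recv(1024) only sets an upper bound on how much is returned, so looping until the expected count is reached is what keeps consecutive images apart.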

Differences between serial communication in matlab vs. python 3.7? Sending int values over 128 via python serial to an arduino

I have some Matlab functions which I would like to translate into Python 3.7. The functions calculate values for joint angles of a little robot from Trossenrobotics and send those values via the serial port to the robot, which is controlled by an Arduino board. The board runs a program from Trossenrobotics, which interprets the data sent via the serial port and reacts accordingly.
I already managed to translate all the functions and they give the same outputs as the Matlab functions, but the serial communication just doesn't work.
In Matlab, fwrite(s, int_value) and fread(s) are used for the communication. The int_values represent a high byte and a low byte of a joint position (0-1024) and are sent separately.
In Python I used pyserial and the functions s.write(byte) and s.read().
I converted the int values into bytes with chr(int).encode().
Since I was struggling with my actual objective, I first wanted to abstract it and make it simpler. Now I am just trying to turn on an LED on the Arduino for 2 seconds when a special byte is received, and send the same byte back to Python.
I noticed that as long as the value I am sending is smaller than 128 it works just fine, but when it's greater it doesn't work.
I printed the output of chr(255).encode(), which is b'\xc3\xbf', and that looked to me like it could be the problem.
Then I tried using chr(255).encode('charmap') and printed it, which gives b'\xff'. That looks right to me, but it still doesn't work for numbers between 128 and 255.
I also noticed that when I send the data over the terminal
with s.write(chr(115).encode()) it doesn't return a value, but when I use
s.write(chr(255).encode('charmap')) it returns a 1.
Here's my Python program:
import serial
from time import sleep
port = 'COM4'
baudrate = 38400
s = serial.Serial(port,baudrate)
sleep(3)
m = 115
s.write(chr(m).encode())
while s.in_waiting == 0:
    print('##### waiting #####')
    sleep(2)
if s.in_waiting > 0:
    r = int.from_bytes(s.read(), byteorder = 'big')
    print(r)
s.close()
And here's the Arduino program:
void setup() {
pinMode(13, OUTPUT);
digitalWrite(13,LOW);
Serial.begin(38400);
}
void loop() {
if (Serial.available() > 0)
{
if (Serial.read() == 's')
{
digitalWrite(13, HIGH);
Serial.write('s');
delay(2000);
}
}
else
{
digitalWrite(13, LOW);
}
}
My questions would be:
Regarding my primary problem (sending multiple bytes via matlab, python):
1) Does anybody know if there are any fundamental differences between serial communication in matlab and in python which could cause my problems?
Regarding my abstracted problem (sending one byte via Python):
2) How can I send values greater than 128 (up to 255) via the serial port?
There is no fundamental difference between Python and Matlab in this regard.
But judging from what you say about your Matlab code:
The int_values represent a high byte and a low byte of a joint position (0-1024) and are sent separately.
it seems that you're sending an int16, so that values up to 1024 fit.
I have no idea what you're trying to do with chr, but I have the feeling that what you need is to replace these lines:
m = 115
s.write(chr(m).encode())
With (on Python 3.x):
m=115
s.write(m.to_bytes(2, byteorder="big"))
That would write: b'\x00s', a mix of hex and ASCII, but you should not worry about that, because that is exactly the same as b'\x00\x73'
And if you do, then you can do: b'\x00s'==b'\x00\x73' and you'll get True.
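As a small illustration of how to_bytes relates to the high byte and low byte mentioned in the question, here is a sketch using an arbitrary example position of 1000. Whether the robot firmware expects the high byte first is an assumption.

```python
position = 1000  # example joint position in the 0-1024 range

# Two-byte big-endian encoding: high byte first, then low byte.
packet = position.to_bytes(2, byteorder="big")
high_byte, low_byte = packet[0], packet[1]
print(packet, high_byte, low_byte)

# The same split done arithmetically.
assert high_byte == position >> 8
assert low_byte == position & 0xFF

# Reassembling the value on the receiving side.
assert int.from_bytes(packet, byteorder="big") == position
```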
Thanks for your answer! And sorry for the late reply.
I already tried that right at the beginning but always got an exception. It took me a while to figure out why. It was because I was using numpy.uint8() for my integer values.
After I removed it I didn't get any exception but it didn't work either.
I used chr() because it didn't throw an exception with the numpy.uint8() and honestly because I did not know what else to do...
Today I finally found the solution.
Here is the link to where I found it:
arduino.stackexchange.com
Using s.write(struct.pack('>B', int_value)) works and seems to be the equivalent of Matlab's fwrite(s, int_value).
Sorry If my question didn't make a lot of sense to you, I am just glad I finally figured it out.
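The difference between the two approaches can be seen without any hardware. A short sketch comparing the bytes that each one produces for the value 255:

```python
import struct

value = 255

utf8 = chr(value).encode()         # UTF-8: code points >= 128 become 2 bytes
packed = struct.pack(">B", value)  # exactly one unsigned byte
direct = bytes([value])            # equivalent one-byte form

print(utf8, packed, direct)

# Below 128 both approaches give the same single byte,
# which is why small values worked with chr().encode().
assert chr(115).encode() == struct.pack(">B", 115) == b"s"
```

With chr(value).encode() the Arduino receives two bytes for any value of 128 or more, so a comparison against a single byte can never match.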

Why NodeMCU sends data with unwanted number?

I am trying to send serial data from a NodeMCU to an Arduino. I program the NodeMCU in MicroPython and use Serial.read on the Arduino. I can send and receive successfully, but the problem is that the NodeMCU sends the data along with a number that is not needed, and the Arduino receives the data along with that number. For example, if I send "Hello" it is sent as "Hello5". I understand that the number is simply the number of characters in the string. How can I remove it?
MicroPython on NodeMCU:
import os
import machine
from machine import UART
uart = UART(0)
import time
while True:
uart.write('1')
Arduino program:
String received;
String msg;
void setup() {
Serial.begin(115200);
attachInterrupt(0, light, FALLING);//When arduino Pin 2 is FALLING from HIGH to LOW, run light procedure!
}
void light() {
Serial.println(msg);
}
void loop()
{
if (Serial.available() > 0){
received = Serial.readStringUntil('\n');
msg = received;
}
}
I just checked the documentation for MicroPython's UART (http://docs.micropython.org/en/latest/wipy/library/machine.UART.html) and Arduino's Serial (https://www.arduino.cc/en/Reference/Serial), and it seems you're missing one initialization line for the UART. The UART documentation states that the default baud rate is 9600, while your serial receiver expects 115200. I believe that using a different baud rate on each side results in undefined behavior.
In your python code, could you try uart.init(115200) after uart = UART(0) call (and the default values for the rest seems same as the Serial's expectations on receiver)?
Also, Serial document says that if it can't find the char you define in the readStringUntil(), then it'll try until it times out. So I guess your function call times-out because it won't find an endline ('\n') in the stream, because you didn't inject any.
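A minimal sketch of sending a newline-terminated message, so that readStringUntil('\n') has a terminator to find. FakeUART is a made-up stand-in so the snippet runs on a desktop Python. On the board, the uart object would be the one created with UART(0) and initialized as suggested above.

```python
class FakeUART:
    """Host-side stand-in for machine.UART so the sketch runs anywhere."""

    def __init__(self):
        self.sent = bytearray()

    def write(self, data):
        self.sent += data
        return len(data)


def send_line(uart, text):
    """Send text followed by a newline so the receiver can find the end."""
    return uart.write(text.encode("ascii") + b"\n")


uart = FakeUART()  # on the NodeMCU: the real UART object instead
written = send_line(uart, "Hello")
print(bytes(uart.sent), written)
```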
In addition, although the help documents of the functionality you're using don't state such a thing, if you really always get the number of characters as the first char at the receiver, it might be worthwhile to try using that to your advantage. I wonder if you can try to get that number first, then read that many chars afterwards (at the Arduino receiver site). Here's some code I hope may help (I'm afraid I didn't try using it):
String msg;
char buffer[256]; // buffer to use while reading the Serial (globals start out zeroed)
void setup() {
    Serial.begin(115200);
}
void loop()
{
    if (Serial.available() > 0) {
        // The first byte gives the number of chars that follow, assuming it is a single byte (at most 255 chars)
        int count = Serial.read();
        size_t n = Serial.readBytes(buffer, count); // may return fewer bytes on timeout
        buffer[n] = '\0'; // terminate the string before converting it
        msg = String(buffer);
    }
}

Sending float type data from Arduino to Python

I need to send float data to Arduino from Python and get the same value back. I thought to send some float data from the Arduino first. The data is sent as 4 successive bytes. I'm trying to figure out how to collect these successive bytes and convert it to proper format at the Python end (system end)
Arduino code:
void USART_transmitdouble(double* d)
{
union Sharedblock
{
char part[4];
double data;
} my_block;
my_block.data = *d;
for(int i=0;i<4;++i)
{
USART_send(my_block.part[i]);
}
}
int main()
{
USART_init();
double dble=5.5;
while(1)
{
USART_transmitdouble(&dble);
}
return 0;
}
Python code (system end):
my_ser = serial.Serial('/dev/tty.usbmodemfa131',19200)
while 1:
    #a = raw_input('enter a value:')
    #my_ser.write(a)
    data = my_ser.read(4)
    f_data, = struct.unpack('<f',data)
    print f_data
    #time.sleep(0.5)
Using the struct module as shown in the code above, I am able to print float values.
50% of the time, the data is printed correctly. However, if I mess with time.sleep() or stop the transmission and restart it, incorrect values are printed out. I think the wrong set of 4 bytes is being unpacked in this case. Any idea what we can do here?
Are there any ways other than the struct module to send and receive float data to and from the Arduino?
Well, the short answer is that there's some interaction going on between software and hardware. I'm not sure how you're stopping the transmission. I suspect that whatever you're doing stops the transfer partway through a byte, and therefore injects a stray byte when you start back up. The time.sleep() part could be that some hardware buffer is overflowing and you're losing bytes, which causes an alignment offset. Once you start grabbing a few bytes from one float and a few bytes from another, you'll start getting the wrong answer.
One thing I've noticed is that you do not have any alignment mechanism. This is often hard to do with a UART because all you can send are bytes. One way would be to send a handshake back and forth. The computer says restart, the hardware restarts the connection (stops sending, clears whatever buffers it has, etc.) and sends some magic value like 0xDEADBEEF. Then the computer can find this 0xDEADBEEF and know where the next message is going to start. You'll still need to be aware of whatever buffers exist in the hardware/OS and take precautions not to overflow them. There are a number of flow control methods, ranging from XON/XOFF to actual hardware flow control.
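The receiving half of that idea could look roughly like this in Python. It is a sketch under assumptions: the Arduino is taken to send the four marker bytes DE AD BE EF before the float data, the name read_floats is made up, and io.BytesIO stands in for the serial port (any object with a read(n) method, such as a pyserial port, would do).

```python
import io
import struct

MAGIC = bytes.fromhex("DEADBEEF")


def read_floats(stream, count):
    """Skip bytes until MAGIC is seen, then read `count` little-endian floats."""
    window = b""
    while window != MAGIC:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended before the marker was found")
        window = (window + byte)[-len(MAGIC):]  # keep only the last 4 bytes
    data = stream.read(4 * count)
    if len(data) < 4 * count:
        raise EOFError("stream ended in the middle of a float")
    return list(struct.unpack(f"<{count}f", data))


# Simulated link: two stray bytes from a cut-off float, then marker and data.
stream = io.BytesIO(b"\xb0\x40" + MAGIC + struct.pack("<2f", 5.5, -1.25))
print(read_floats(stream, 2))
```

A float's own bytes can in principle coincide with the marker, so this only realigns reliably when the sender pauses or resets before sending it, as the handshake above describes.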
Because this question ranks highly on search engines I have put together a working solution.
WARNING: Unless you need the full floating point precision, convert to a string and send that (using sprintf, dtostrf, or Serial.print(value, NumberOfDecimalPlaces)). This is because the following solution a) won't work between machines of different endianness and b) may have some of its bytes misinterpreted as control characters.
Solution: Get the pointer for the floating point number and then pass it as a byte array to Serial.write().
e.g.
/*
Code to test send_float function
Generates random numbers and sends them over serial
*/
void send_float (float arg)
{
// get access to the float as a byte-array:
byte * data = (byte *) &arg;
// write the data to the serial
Serial.write (data, sizeof (arg));
Serial.println();
}
void setup(){
randomSeed(analogRead(0)); //Generate random number seed from unconnected pin
Serial.begin(9600); //Begin Serial
}
void loop()
{
int v1 = random(300); //Generate two random ints
int v2 = random(300);
float test = ((float) v1)/((float) v2); // Then generate a random float
Serial.print("m"); // Print test variable as string
Serial.print(test,11);
Serial.println();
//print test variable as float
Serial.print("d"); send_float(test);
Serial.flush();
//delay(1000);
}
Then to receive this in Python I used your solution, and added a function to compare the two outputs for verification purposes.
# Script to compare the two numbers and identify any error between sending as raw float bytes and as ASCII
# (written for Python 3 with pyserial)
import serial
import struct
ser = serial.Serial('/dev/ttyUSB0', 9600)  # Change this line to your port (this is for Linux; 'COM7' or similar for Windows)
while True:
    if ser.in_waiting > 2:
        command = ser.read(1)  # read the first byte
        if command == b'm':
            vS = ser.readline()  # the value sent as text
            ser.read(1)  # skip the 'd' marker
            data = ser.read(4)  # the value sent as 4 raw bytes
            ser.readline()  # consume the trailing line ending
            vF, = struct.unpack('<f', data)
            vSf = float(vS.decode('ascii'))
            diff = abs(vF - vSf)
            if diff < 1e-11:
                diff = 0
            print("Str:", vSf, " Fl: ", vF, " Dif:", diff)
References:
Sending a floating point number from python to arduino and
How to send float over serial
I don't know Python, however, what is wrong with the Arduino sending the number like this:
float value = 1.234;
Serial.println(value);
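On the Python side, a line sent this way only needs to be decoded and converted. A minimal sketch follows. The function name parse_float_line is made up, and with pyserial the line would typically come from ser.readline().

```python
def parse_float_line(line):
    """Convert one received text line (bytes ending in CR LF) to a float."""
    return float(line.decode("ascii").strip())


# Serial.println(value) prints two decimal places by default,
# so 1.234 arrives as the text "1.23" followed by CR LF.
print(parse_float_line(b"1.23\r\n"))
```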
For the Arduino to receive a float:
#include <stdio.h>
#include <stdlib.h>
void loop() {
    char data[10], *end;
    char indata = 0; // initialised so the first loop test is well defined
    int i=0;
    float value;
    while ((indata!=13) && (i<10)) {
        if (Serial.available() > 0) {
            indata = Serial.read();
            data[i] = indata;
            i++;
        }
    }
    i-=1;
    data[i] = 0; // replace carriage return with 0
    value = strtof(data,&end);
}
Note this code is untested although very similar to code I have used in the past.
