Is there a way to capture and write very fast serial data to a file?
I'm using a 32 kSPS external ADC and a baud rate of 2000000, printing in the following format: adc_value (32 bits) \t millis()
This results in roughly 15 prints every millisecond. Unfortunately, every single solution I have tried fails to capture and store the data to a file in real time. This includes Processing sketches, TeraTerm, Serial Port Monitor, PuTTY, and some Python scripts; all of them are unable to log the data in real time.
The Arduino Serial Monitor, on the other hand, is able to display the serial data in real time, but it can't log it to a file, as it lacks that function.
Here's a screenshot of the Arduino Serial Monitor with the incoming data:
One likely problem is that you perform a write every time you receive a new record. Issuing one write per record wastes a lot of time in write calls.
Instead, try to collect the data into buffers, and when a buffer is about to overflow, write the whole buffer in a single, as-low-level-as-possible write call.
And so that receiving the data doesn't stall for too long, you could use threads and double-buffering: receive data in one thread, writing it into a buffer. When the buffer is about to overflow, signal a second thread and switch to a second buffer. The other thread takes the full buffer, writes it to disk, and waits for the next buffer to become full.
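A minimal sketch of that pattern in Python, assuming pyserial; the port name, baud rate, buffer size, and output filename are all placeholders you would adjust to your setup:

import threading
import queue
import serial  # pyserial

BUF_SIZE = 64 * 1024          # flush to disk in 64 KB chunks
full_buffers = queue.Queue()  # hand-off point between the two threads

def reader():
    # port name and baud rate are assumptions; match them to your hardware
    ser = serial.Serial("COM3", 2000000, timeout=1)
    buf = bytearray()
    while True:
        buf += ser.read(ser.in_waiting or 1)   # grab whatever has arrived
        if len(buf) >= BUF_SIZE:
            full_buffers.put(bytes(buf))       # hand the full buffer off...
            buf = bytearray()                  # ...and keep reading into a fresh one

def writer():
    # buffering=0 keeps each write as low-level as possible
    with open("capture.bin", "wb", buffering=0) as f:
        while True:
            f.write(full_buffers.get())

threading.Thread(target=reader, daemon=True).start()
writer()  # run the writer in the main thread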
After trying more than 10 possible solutions to this problem, including dedicated serial-capture software, Python scripts, MATLAB scripts, and some alternative C projects, the only one that more or less worked for me proved to be MegunoLink Pro.
It does not achieve the ADC's full 32 kSPS potential, rather around 12-15 kSPS, but that is still much better than anything else I've tried.
Not achieving the full 32 kSPS might also be limited by the Serial.print() method that I'm using for printing values to the serial console. By the way, the platform I've been using is an ESP32.
Later edit: don't forget to edit the MegunoLinkPro.exe.config file in the MegunoLink Pro install directory in order to add further baud rates, like 1000000 or 2000000. By default it is limited to 500000.
Hope you're well and thanks for reading.
I've been revisiting an old project, leveraging plotly to stream data out of MySQL with Python in between the two. I've never had great luck with plot.ly (which I'm sure says more about my understanding than their platform): streams/iframes seem to stall over time, and I'm not apt enough to troubleshoot it completely.
My current symptom is this: plots arbitrarily stall. I'm pushing data, but the iframe isn't updating.
The current solution is: refresh the browser every X minutes.
The solution works, but it's aggravating, because I don't understand why the visual is stalling in the first place (is it me, is it them, etc.).
As I was reviewing some of the documentation, specifically this link:
https://plot.ly/streaming/
I noticed they call out NOT continually opening and closing streams, and say that heartbeats should be sent every so often to keep things alive/fresh.
Here's what I'm currently calling every 10 minutes:
pullData(mysql)
format data
open(plotly.stream1)
write data to plotly.stream1
close(plotly.stream1)
open(plotly.stream2)
write data to plotly.stream2
close(plotly.stream2)
Based on what I am reading, it sounds like I should actually execute the script once on startup, keep the streams open, and heartbeat() them every 15 or so seconds between actual write() calls, like this:
open(plotly.stream1)
open(plotly.stream2)
every 10 minutes:
    pullData(mysql)
    format data
    write data to plotly.stream1
    write data to plotly.stream2
while not pulling and writing:
    every 15 seconds:
        heartbeat(plotly.stream1)
        heartbeat(plotly.stream2)
if error:
    close(plotly.stream1)
    close(plotly.stream2)
Please excuse the pseudo-mess; I'm just trying to convey an idea. Anyone have any advice? I started on my original path of opening, writing, and closing based on the streaming example, but that's a one-time write. The other example is a constant stream of data. I'm somewhere in between those two.
Furthermore: is this train of thought even related to the iframe not refreshing? Part of me believes the symptom is unrelated to my idea. The data is getting to plot.ly fine; it's my session that's expiring, or the iframe "connection" that's going stale. If the symptom is unrelated, at least I'll have made my source code a bit cleaner and more appropriate.
Any advice is greatly appreciated!
Thanks
-justin
Plotly will close a stream that is inactive for more than 60 seconds. You must send a newline down the streaming channel (a heartbeat) to keep it open. I recommend one every 30 seconds.
Your first code example may not work as expected because the client-side websocket (which connects the plot to our system) may close when your first source stream (the stream that connects your script to our system) exits. When you disconnect a source stream, a signal is sent to our system that marks your stream as inactive. If a new source stream does not reconnect quickly, we close the client-connecting websockets.
Now, when your script gets more data and opens a new stream, it will successfully stream data to our system, but the client-side websocket, now closed, will not pass the data on to the plot. We cache a certain number of points for you behind the scenes, so that when you refresh the page the websocket reconnects and you get the last n points (where n is set by max-points in the API call).
This is why sending the heartbeat is important. It keeps the source stream open, and that in turn ensures that all the connected clients keep their websockets open.
This isn't necessarily the most robust behaviour for a streaming platform to have, and we will likely improve it in the future. For now, though, you will likely see better results by implementing the code in your second example.
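For example, here is a minimal sketch of that second pattern, assuming the legacy plotly Python streaming API (plotly.plotly.Stream); the tokens and the pull_and_format() helper are placeholders for your own stream tokens and MySQL pull/format code:

import time
import plotly.plotly as py  # legacy streaming API (assumption: plotly v1.x)

stream1 = py.Stream("your_stream_token_1")  # tokens are placeholders
stream2 = py.Stream("your_stream_token_2")
stream1.open()
stream2.open()

HEARTBEAT_EVERY = 30   # seconds; well under the 60-second inactivity limit
WRITE_EVERY = 600      # your 10-minute data push

last_write = 0
try:
    while True:
        if time.time() - last_write >= WRITE_EVERY:
            data1, data2 = pull_and_format()   # hypothetical MySQL pull/format step
            stream1.write(data1)
            stream2.write(data2)
            last_write = time.time()
        else:
            stream1.heartbeat()                # a newline keeps the stream open
            stream2.heartbeat()
        time.sleep(HEARTBEAT_EVERY)
finally:
    stream1.close()
    stream2.close()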
Hope that helped!
I am working on a project for work and I am stuck on a part where I need to monitor a serial line and listen for certain words using Python.
So, the setup is that we have an automated RAM-testing machine that tests RAM one module at a time and interacts via serial with the software that came with it. That software is for monitoring/configuring the testing process, and it also displays all of the information from each module's SPD chip. While the RAM tester was running, I ran a serial-port monitoring program and was able to see the same information that the software displays. The data I'm interested in is the speed of the RAM and the pass/fail result, both of which I was able to find in the data I monitored coming over the serial line. There are only 5 different speeds of RAM that we test, so I was hoping to have Python monitor the serial line and wait for the speed of the RAM and the pass/fail result to come across. Once Python detects the speed of the RAM, and if the module passes, I will have Python write to an Arduino, and the Arduino will control a conveyor belt that sorts the RAM by speed.
My idea is to have a variable for each of the RAM speeds, all initialized to 0. When Python detects a RAM speed on the serial line, it will set the corresponding variable to 1. Then, when the test is over, the result, either pass or fail, will come over the serial line. This is where I am going to try to use an if statement. I imagine it would look something like this:
if PC-6400 == 1 and ser.read() == pass
    ser.write(PC-6400)  # serial write to the Arduino
I know the use of ser.read() == pass is incorrect, and that's where I'm stuck: I do not know how to use ser.read() to look for certain words. I need it to look for the RAM speed (in this case PC-6400) and the word pass, but I have not been successful in getting it to find either. Where I am currently stuck is getting Python to detect the RAM speed so it can change the value of the variable. Would it be something close to this?
if ser.read() == PC-6400
    PC-6400 = 1
This whole thing is a bit difficult for me to explain and I hope it all makes sense. I thank you in advance if anyone can give me some advice on how to get this going. I am pretty new to Python, and this is the most adventurous project I have worked on with it so far.
I'm still a bit confused, but here's a very basic example to hopefully get you started.
This will read from a serial port. It will attempt to read 512 bytes (which just means 512 characters of a string). If 512 bytes aren't available, it will wait forever, so make sure you set a timeout when you open the serial connection.
return_str = ser.read(size=512).decode("ascii", errors="ignore")  # in Python 3, read() returns bytes; decode to a str
You can also see how many bytes are waiting to be read:
print "num bytes available = ", ser.inWaiting()
Once you have a string, you can check words within the string:
if "PASS" in return_str:
print "the module passed!"
if "PC-6400" in return_str:
print "module type is PC-6400"
To do something similar, but with variables:
# the name pass is reserved (it's a Python keyword)
pass_flag = "PASS"
PC6400 = 0
if pass_flag in return_str and "PC-6400" in return_str:
    PC6400 = 1
Keep in mind that it is possible to read only part of a line if you are too quick. You can add delays by using timeouts or the time.sleep() function. You might also find you need to wait a couple of seconds after opening the connection, as the Arduino resets when you connect. This will give it time to reset before you try to read/write.
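Putting the pieces together, a rough end-to-end sketch might look like this; the port names and baud rates are assumptions, and whether the Arduino sits on a second port depends on your wiring:

import time
import serial  # pyserial

tester = serial.Serial("COM4", 9600, timeout=2)   # RAM tester's serial line (assumed)
arduino = serial.Serial("COM5", 9600, timeout=2)  # Arduino's serial line (assumed)
time.sleep(2)  # the Arduino resets when the port opens; give it a moment

detected_speed = None
while True:
    line = tester.readline().decode("ascii", errors="ignore")
    if "PC-6400" in line:            # add checks for the other four speed strings here
        detected_speed = "PC-6400"
    if "PASS" in line and detected_speed:
        arduino.write(detected_speed.encode())  # tell the Arduino which bin to use
        detected_speed = None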
I apologize in advance for this being a bit vague, but I'm trying to figure out the best way to structure my program from a high-level perspective. Here's an overview of what I'm trying to accomplish:
RasPi takes input from altitude sensor on serial port at 115000 baud.
Does some hex -> dec math and updates state variables (pitch, roll, heading, etc)
Uses pygame library to do some image manipulation based on the state variables on a simulated heads up display
Outputs the image to a projector at 30 fps.
Note that there's no user input (for now).
The issue I'm running into is the framerate. The framerate MUST be constant. I'd rather skip a data packet than drop a frame.
There are two ways I can see to structure this:
Write one function that, when called, grabs data from the serial bus and returns the state variables. Then write a pygame loop that calls this function from inside it. My concern here is that if the serial port starts being read at the end of an attitude message, the function will have to pause and wait for the next message to start (fractions of a second, but that could mean a dropped frame).
Write two separate modules, both running simultaneously. One continuously reads data from the serial port and updates the state variables as fast as possible. The other just does the image manipulation and grabs the latest state variables when it needs them. However, I'm not actually sure how to write a multithreaded program like this, and I don't know how well the RasPi will handle one.
I don't think the RasPi would handle a multithreaded program like that very well. Try the first method, though it would be interesting to see the results of a multithreaded version.
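That said, if you do experiment with the second approach, a minimal sketch might look like this; parse_attitude() and draw_hud() are placeholders for your own hex-to-dec decoding and HUD drawing code, and the port name and baud rate are assumptions:

import threading
import pygame
import serial  # pyserial

state = {"pitch": 0.0, "roll": 0.0, "heading": 0.0}
state_lock = threading.Lock()

def serial_reader():
    ser = serial.Serial("/dev/ttyAMA0", 115200, timeout=1)  # port/baud assumed
    while True:
        line = ser.readline()
        if not line:
            continue
        pitch, roll, heading = parse_attitude(line)  # hypothetical parser
        with state_lock:
            state.update(pitch=pitch, roll=roll, heading=heading)

threading.Thread(target=serial_reader, daemon=True).start()

pygame.init()
screen = pygame.display.set_mode((1024, 768))
clock = pygame.time.Clock()
while True:
    pygame.event.pump()          # keep pygame's event machinery serviced
    with state_lock:
        snapshot = dict(state)   # copy the latest values; never block on serial
    draw_hud(screen, snapshot)   # hypothetical HUD drawing routine
    pygame.display.flip()
    clock.tick(30)               # lock the render loop to 30 fps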
So I have two scripts: main.py, which is run on startup and stays running in the background, and otherscript.py, which is run whenever the user invokes it.
main.py crunches some data, then writes it out to a file on every iteration of its while loop (the data is about ~1.17 MB), erasing the old data, so data.txt always contains the latest crunched data.
otherscript.py will read data.txt (the current data at that instant) and then do something with it.
main.py
while True:
    data = crunchData()
    # overwrite the file so it always holds the latest data
    with open("data.txt", "w") as f:
        f.write(data)
otherscript.py
with open("data.txt") as f:
    data = f.read()
doSomethingWithData(data)
How can I make the hand-off between the two scripts faster? Are there any alternatives to writing the data to a file?
This is a problem of Inter-Process Communication (IPC). In your case, you basically have a producer process, and a consumer process.
One way of doing IPC, as you've found, is using files. However, it'll saturate the disk quickly if there's lots of data going through.
If you had a straight consumer that wanted to read all the data all the time, the easiest way to do this would probably be a pipe - at least if you're on a unix platform (mac, linux).
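For instance, a named-pipe (FIFO) sketch, reusing the crunchData()/doSomethingWithData() names from your question; the fifo path is an arbitrary choice:

# main.py (producer): write each batch into the fifo
import os

FIFO = "/tmp/crunch_fifo"  # arbitrary path
if not os.path.exists(FIFO):
    os.mkfifo(FIFO)

while True:
    data = crunchData()          # the existing crunch step
    with open(FIFO, "w") as f:   # blocks until otherscript.py opens its end
        f.write(data)

# otherscript.py (consumer): read one batch on demand
FIFO = "/tmp/crunch_fifo"
with open(FIFO) as f:
    data = f.read()
doSomethingWithData(data)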
If you want a cross-platform solution, my advice, in this case, would be to use a socket. Basically, you open a network port on the producer process, and every time a consumer connects, you dump the latest data. You can find a simple how-to on sockets here.
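A rough sketch of the socket approach, again reusing the names from your question; the port number is arbitrary:

# main.py (producer): serve the latest crunched data on a local port
import socket
import threading

HOST, PORT = "127.0.0.1", 50007  # arbitrary local port
latest = b""
latest_lock = threading.Lock()

def crunch_loop():
    global latest
    while True:
        data = crunchData()              # the existing crunch step
        with latest_lock:
            latest = data.encode()

threading.Thread(target=crunch_loop, daemon=True).start()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind((HOST, PORT))
server.listen(1)
while True:
    conn, _ = server.accept()            # a consumer connected
    with conn:
        with latest_lock:
            conn.sendall(latest)         # dump the latest data, then close

# otherscript.py (consumer): connect and read everything until EOF
import socket

chunks = []
with socket.create_connection(("127.0.0.1", 50007)) as s:
    while True:
        chunk = s.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
doSomethingWithData(b"".join(chunks).decode())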
I have a pcap file (~90 MB) that I want to replay. I came across scapy, which provides a way to read a pcap file and replay it. I tried the following two ways to replay the packets:
sendp(rdpcap(<filename>))
and
pkts = PcapReader(<filename>)
for pkt in pkts:
    sendp(pkt)
The first one gave me a memory error: the Python process's memory consumption went up to 3 GB and it finally died. But the second option worked fine for me because it did not read the whole file into memory. I have the following three questions:
Is a 90 MB pcap file too big for scapy to replay?
Whenever we use tcpdump/wireshark, every packet has a timestamp associated with it. Assume packet 1 arrived at time T and packet 2 at time T+10: will scapy replay the packets in a similar manner, the first packet at time T and the second at T+10? Or will it just keep sending them back-to-back in a loop? I think the latter is the case with PcapReader.
If the answer to the above question is no (it just replays in a loop, without considering the packet inter-arrival times), is there any other Python library that can do this job for me? Even Python is not a constraint for me.
To answer your first question: well, it sounds like you answered it yourself! Try running the first option again on another pcap file that's 40-50 MB instead and see if that errors out too. That way you can at least check whether the file is too big for your system in combination with Scapy (not enough RAM in your system to handle how Scapy runs its algorithms, as it was built to handle a few packets at a time, not a 90 MB pcap file) or whether it's just something in the code.
To answer your second question: based off of the reading I've been doing on Scapy over the past few weeks, I strongly believe the answer is yes. However, I don't know of any sources off the top of my head to back that up.
Ninja edit - I saw this on another StackOverflow question - Specify timestamp on each packet in Scapy?
While that is for a single packet, if every packet is timestamped within Scapy, then I imagine it would be the same for every packet in a large pcap file that you read in. That way, when you replay the packets, they should go out in the same order.
A lot of educated guessing going on in this answer, hope it helps you though!
No, it shouldn't take up 3 GB of memory. I frequently open larger pcap files on machines with only 2 GB of memory. Just try doing pkts = rdpcap(<filename>) to see how much memory that takes, then go from there. If the problem persists, you may wish to try a different version of scapy.
No, sendp() does not do this by default. You could try the realtime parameter (type help(sendp) on the console). But overall, based on my experience, scapy isn't so good at keeping accurate timing.
tcpreplay (a Linux CLI tool) is what I use. It has many options, including various time-keeping mechanisms.
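If you do want to stay in Python, here is a rough sketch that paces sendp() using the timestamps scapy exposes as pkt.time; accuracy will still be limited by Python and OS scheduling:

import time
from scapy.all import PcapReader, sendp

# Replay a capture, sleeping for the recorded inter-arrival gap between packets.
# "capture.pcap" is a placeholder; sending raw frames usually requires root.
prev_ts = None
for pkt in PcapReader("capture.pcap"):
    if prev_ts is not None:
        gap = float(pkt.time) - prev_ts
        if gap > 0:
            time.sleep(gap)
    prev_ts = float(pkt.time)
    sendp(pkt, verbose=False)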