I have the following Python script:
#!/usr/bin/env python
# coding: utf-8

import time
import serial
import datetime
from datetime import timedelta
import os.path

PATH = '/home/pi/test/'
Y = datetime.datetime.now().strftime('%Y')[3]

def get_current_time():
    return datetime.datetime.now()

def get_current_time_f1():
    return datetime.datetime.now().strftime('%Y/%m/%d %H:%M:%S')

def get_current_time_f2():
    return datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')

def compare_time():
    current_time = get_current_time()
    if current_time.minute == 59 and current_time.second >= 45:
        return "new"

def file_date():
    if compare_time() == "new":
        plusonehour = datetime.datetime.now() + timedelta(hours=1)
        return plusonehour.strftime('Z'+Y+'%m%d%H')
    else:
        return datetime.datetime.now().strftime('Z'+Y+'%m%d%H')

def createCeilFile():
    filename = os.path.join(PATH, file_date()+".dat")
    fid = open(filename, "w")
    fid.writelines(["-Ceilometer Logfile","\n","-File created: "+get_current_time_f1(),"\n"])
    return fid

# open the first file at program start
fid = createCeilFile()

# serial port settings
ser = serial.Serial(
    port='/dev/ttyUSB0',
    baudrate = 19200,
    parity=serial.PARITY_NONE,
    stopbits=serial.STOPBITS_ONE,
    bytesize=serial.EIGHTBITS,
)

counter = 0

# read first byte, grab date string, read rest of string, print both in file
while 1:
    tdata = ser.read()
    time.sleep(3)
    data_left = ser.inWaiting()
    tdata += ser.read(data_left)
    fid.writelines(["-"+get_current_time_f2(),"\n",tdata,"\n"])
    # should have ~10 secs before next message needs to come in
    # if next string will go into the next hour
    if compare_time() == "new":
        # close old file
        fid.writelines(["File closed: "+get_current_time_f2(),"\n"])
        fid.close()
        # open new file
        fid = createCeilFile()
    # then it goes back to 'tdata = ser.read()' and waits again.
It works fine and stores all the data I need in the correct format and so on.
A data message from the device comes through every 15 seconds. The Python script runs indefinitely and reads those messages. At the beginning of each message the script writes the time at which the message was written to the file, and therefore received. That timestamp is the problem with this script: I see a drift of about 3 to 4 seconds over 24 hours. What is strange is that the time drifts backwards. So if I start with data messages coming in at 11, 26, 41 and 56 seconds within the minute, after 24 hours the messages seem to come in at 8, 23, 38 and 53 seconds within the minute.
Does anyone have an explanation for this, or maybe a way to compensate for it? I thought about restarting the program every hour, after it saves the hourly file. Maybe that would reset the weird drift?
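One way to narrow this down, sketched below under the assumption that the drift comes from where the timestamp is taken rather than from the system clock: capture the time immediately after the first byte of a message arrives, before the 3-second sleep and the follow-up read, so the logged time reflects arrival rather than the end of the read cycle. The helper name read_message is hypothetical:

# Hypothetical sketch: timestamp as soon as the first byte arrives,
# before sleeping and draining the rest of the message.
def read_message(ser):
    first_byte = ser.read()           # blocks until the message starts
    arrival = get_current_time_f2()   # timestamp taken at arrival
    time.sleep(3)                     # give the rest of the message time to arrive
    rest = ser.read(ser.inWaiting())
    return arrival, first_byte + rest

If the drift persists even with arrival-based timestamps, the Pi's system clock itself is the more likely culprit (e.g. no NTP sync), which the script cannot compensate for on its own.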
I'm trying to record some audio using the sounddevice library in Python. This is to create a sound-activated recording for a scanner.
I cannot use the record function, as it requires a fixed duration, which is not suitable for my program since each transmission may be of a different length. So I added a callback to the stream: when the audio rises above a certain level, it triggers the recording by creating a new file and writing the buffer data to the wave file.
I have implemented the trigger successfully; however, when I write to the file and open it, it is not what the transmission was, instead it is just corrupt static.
Here's my code:
import time as timer
import wave

import numpy as np
import sounddevice as sd

RecordActivate = False
StartTime = 0

def print_sound(indata, outdata, frames, time, status):
    global RecordActivate
    global StartTime
    volume_norm = np.linalg.norm(indata) * 10
    VolumeLevel = int(volume_norm)
    global f
    if RecordActivate == False:
        print("itls false" + str(VolumeLevel))
        if VolumeLevel > 16:
            RecordActivate = True
            print("Begin")
            StartTime = timer.time()
            ThingTime = timer.strftime(
                "%Y-%m-%d %H:%M:%S", timer.localtime(timer.time())
            )
            print("Transmission detected at " + ThingTime)
            f = wave.open("scan.wav", "w")
            f.setnchannels(1)
            f.setsampwidth(1)
            f.setframerate(44100)
    if RecordActivate == True:
        if VolumeLevel < 16:
            print("Transmission ceased.")
            RecordActivate = False
        else:
            f.writeframes(indata.tobytes())
            print("recording..")

with sd.Stream(callback=print_sound):
    while True:
        thing = 0
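For what it's worth, a likely cause of the static is a sample-format mismatch: sounddevice delivers float32 samples by default, while the wave file above is declared with a 1-byte sample width. A rough sketch of the two changes inside print_sound(), assuming the default float32 input:

# 1) When opening the file, declare 16-bit samples:
f = wave.open("scan.wav", "w")
f.setnchannels(1)
f.setsampwidth(2)      # 2 bytes per sample for int16 PCM
f.setframerate(44100)  # should match the stream's actual sample rate

# 2) When writing, convert float32 samples in [-1.0, 1.0] to int16 first:
pcm = (np.clip(indata[:, 0], -1.0, 1.0) * 32767).astype(np.int16)
f.writeframes(pcm.tobytes())

sd.Stream also defaults to the device's sample rate rather than 44100, so passing samplerate=44100 (or writing the stream's actual rate into the wave header) keeps the pitch correct.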
I'm using this Python script:
LINK
It's working great so far.
But now I would like to optimize it, because sometimes the script gets executed 2-3 times within 10-20 minutes. It always runs when there are 3 streams or more (e.g. a 4th stream is started --> the notification is sent again, or a user cancels a stream and starts another movie --> the script runs again!).
I have tried to use time.sleep but that is not working. I would like to have it like this:
Once the program has been executed, it shouldn't run again within the next 60 minutes.
What do I need to use / code here?
Thanks for the help!
Thank you for the tip; my code looks like this now (can you maybe check it?):
** code section ** = the code I have merged into the existing script.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Description: Send a PlexPy notification when the total
#              number of streams exceeds a threshold.
# Author: /u/SwiftPanda16
# Requires: requests
# PlexPy script trigger: Playback start
# PlexPy script arguments: {streams}

import requests
import sys
**import os
from datetime import datetime, timedelta**

### EDIT SETTINGS ###

PLEXPY_URL = 'xx.xxx.xx:8181'
PLEXPY_APIKEY = 'xxxxxxxxxxxxxxxxxx'
AGENT_ID = 14  # The PlexPy notifier agent id found here: https://github.com/JonnyWong16/plexpy/blob/master/API.md#notify
NOTIFY_SUBJECT = 'test'  # The notification subject
NOTIFY_BODY = 'Test'
STREAM_THRESHOLD = 3

**### time management ###
one_hour_ago = datetime.now() - timedelta(minutes=60)
filetime = datetime.fromtimestamp(os.path.getctime("timestamp.txt"))
if filetime < one_hour_ago:**

    ### CODE BELOW ###

    def main():
        try:
            streams = int(sys.argv[1])
        except:
            print("Invalid PlexPy script argument passed.")
            return

        if streams >= STREAM_THRESHOLD:
            print("Number of streams exceeds {threshold}.".format(threshold=STREAM_THRESHOLD))
            print("Sending PlexPy notification to agent ID: {agent_id}.".format(agent_id=AGENT_ID))

            params = {'apikey': PLEXPY_APIKEY,
                      'cmd': 'notify',
                      'agent_id': AGENT_ID,
                      'subject': NOTIFY_SUBJECT,
                      'body': NOTIFY_BODY}

            r = requests.post(PLEXPY_URL.rstrip('/') + '/api/v2', params=params)
            **os.getcwd()
            open('timestamp.txt', 'w')**

        else:
            print("Number of streams below {threshold}.".format(threshold=STREAM_THRESHOLD))
            print("No notification sent.")
            return

    if __name__ == "__main__":
        main()

**else:
    pass**
Have the script write a timestamp to an external file and check that file at startup.
Here is an example:
import time

def script_has_run_recently(seconds):
    filename = 'last-run-time.txt'
    current_time = int(time.time())
    try:
        with open(filename, 'rt') as f:
            last_run = int(f.read().strip())
    except (IOError, ValueError) as e:
        last_run = 0
    if last_run + seconds > current_time:
        return True
    else:
        with open(filename, 'wt') as f:
            f.write(str(current_time))
        return False

def main():
    print('running the main function.')

if __name__ == "__main__":
    seconds = 3600  # one hour in seconds
    if script_has_run_recently(seconds):
        print('you need to wait before you can run this again')
    else:
        main()
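To wire this into the PlexPy script above, a rough sketch (assuming the notification logic stays in its existing main()) would gate the call to main() rather than indenting the function definitions under an if:

# Sketch: guard the existing PlexPy main() with the timestamp check,
# instead of wrapping the def statements in a module-level if/else.
if __name__ == "__main__":
    seconds = 3600  # one hour
    if script_has_run_recently(seconds):
        print('Notification already sent within the last hour; skipping.')
    else:
        main()  # the notification logic from the PlexPy script above

The module-level if/else around the def statements only decides whether the functions get defined, not whether they run, so guarding the main() call is the simpler place for the check.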
I am comparing scapy and dpkt in terms of speed. I have a directory of pcap files which I parse, counting the HTTP requests in each file. Here's the scapy code:
import os
import time

from scapy.all import *

def parse(f):
    x = 0
    pcap = rdpcap(f)
    for p in pcap:
        try:
            if p.haslayer(TCP) and p.getlayer(TCP).dport == 80 and p.haslayer(Raw):
                x = x + 1
        except:
            continue
    print x

if __name__ == '__main__':
    path = '/home/pcaps'
    start = time.time()
    for file in os.listdir(path):
        current = os.path.join(path, file)
        print current
        f = open(current)
        parse(f)
        f.close()
    end = time.time()
    print (end - start)
The script is really slow (it gets stuck after a few minutes) compared to the dpkt version:
import dpkt
import time
from os import walk
import os
import sys

def parse(f):
    x = 0
    try:
        pcap = dpkt.pcap.Reader(f)
    except:
        print "Invalid Header"
        return
    for ts, buf in pcap:
        try:
            eth = dpkt.ethernet.Ethernet(buf)
        except:
            continue
        if eth.type != 2048:
            continue
        try:
            ip = eth.data
        except:
            continue
        if ip.p == 6:
            if type(eth.data) == dpkt.ip.IP:
                tcp = ip.data
                if tcp.dport == 80:
                    try:
                        http = dpkt.http.Request(tcp.data)
                        x = x + 1
                    except:
                        continue
    print x

if __name__ == '__main__':
    path = '/home/pcaps'
    start = time.time()
    for file in os.listdir(path):
        current = os.path.join(path, file)
        print current
        f = open(current)
        parse(f)
        f.close()
    end = time.time()
    print (end - start)
So is there something wrong with the way I am using scapy? Or is it just that scapy is slower than dpkt?
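One thing that likely contributes to the gap: rdpcap() loads the entire capture into memory before the loop starts, while the dpkt version streams packets one at a time, so large files can grind or swap under scapy. A minimal sketch of the same per-file count using scapy's streaming PcapReader instead (same imports and Python 2 style as the script above; note it takes the file path directly rather than an open file object):

def parse(filename):
    # Stream packets one at a time instead of loading the whole file with rdpcap().
    x = 0
    for p in PcapReader(filename):
        if p.haslayer(TCP) and p.getlayer(TCP).dport == 80 and p.haslayer(Raw):
            x = x + 1
    print x

This keeps memory flat for large captures; scapy's per-packet dissection is still heavier than dpkt's, so some speed difference will remain (as the comparison below also suggests).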
You inspired me to compare. 2 GB PCAP. Dumb test. Simply counting the number of packets.
I'd expect this to be in single digit minutes with C++ / libpcap just based on previous timings of similar sized files. But this is something new. I wanted to prototype first. My velocity is generally higher in Python.
For my application, streaming is the only option. I'll be reading several of these PCAPs simultaneously and doing computations based on their contents. Can't just hold in memory. So I'm only comparing streaming calls.
scapy 2.4.5:
from scapy.all import *
import datetime

i = 0
print(datetime.datetime.now())
for packet in PcapReader("/my.pcap"):
    i += 1
else:
    print(i)
print(datetime.datetime.now())
dpkt 1.9.7.2:
import datetime

import dpkt

print(datetime.datetime.now())
with open(pcap_file, 'rb') as f:
    pcap = dpkt.pcap.Reader(f)
    i = 0
    for timestamp, buf in pcap:
        i += 1
    else:
        print(i)
print(datetime.datetime.now())
Results:
Packet count is the same. So that's good. :-)
dpkt - Just under 10 minutes.
scapy - 35 minutes.
dpkt went first, so if the disk cache were helping either package, it would be scapy. And I think it might be, marginally: I did this previously with scapy only, and it was over 40 minutes.
In summary, thanks for your 5-year-old question. It's still relevant today. I almost bailed on Python here because of the overly long read times from scapy. dpkt seems substantially more performant.
Side note, alternative packages:
https://pypi.org/project/python-libpcap/ I'm on python 3.10 and 0.4.0 seems broken for me, unfortunately.
https://pypi.org/project/libpcap/ I'd like to compare timings to this, but have found it much harder to get a minimal example going. Haven't spent much time though, to be fair.
I have an Arduino that reports time (in seconds), voltage, current and joules every 60 seconds. In the serial monitor it looks like this:
time,voltage,current,joules
60,1.45,0.39,0.57
120,1.45,0.39,1.13
180,1.45,0.39,1.70
240,1.45,0.39,2.26
...
However, with the following Python script I don't get this result:
import serial

ser = serial.Serial('COM5', 9600)
logfile = open("batterytest.log", 'w')

while True:
    if ser.readline() == b'Test Complete!':
        logfile.close()
        exit()
    logfile.write(ser.readline().decode("utf-8"))
    logfile.flush()
Instead I see results every 120 seconds:
time,voltage,current,joules
120,1.13,0.02,0.05
240,1.13,0.02,0.09
360,1.13,0.02,0.14
480,1.13,0.02,0.19
....
It looks like every other data point is being dropped because ser.readline() is called twice per loop iteration: the line consumed by the if check is thrown away, and only the next one is written to the log, which is why you only see every second reading (120 seconds apart). You can also use PuTTY to confirm that your Arduino does in fact output the right data points.
For your PySerial program, I would add a variable "data" to store the result of readline() first, then perform your logic on it:
import serial

ser = serial.Serial('COM5', 9600)
logfile = open("batterytest.log", 'w')

while True:
    data = ser.readline()
    if data == b'Test Complete!':
        logfile.close()
        exit()
    logfile.write(data.decode("utf-8"))
    logfile.flush()
Also, depending on your Arduino output timing, you may consider adding a timeout value for your serial read by:
ser = serial.Serial('COM5', 9600, timeout=1)
# Here the timeout is 1 second
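Note that once a timeout is set, readline() can return an empty bytes object (or a partial line) when no complete line arrives in time, so the loop should skip those instead of logging them. A minimal sketch of that guard on the same loop as above (stripping also drops the trailing newline that readline() keeps, which would otherwise stop the b'Test Complete!' comparison from ever matching if the Arduino sends it with println):

while True:
    data = ser.readline()
    if not data.strip():
        continue  # timeout expired with no complete line; try again
    if data.strip() == b'Test Complete!':
        logfile.close()
        break
    logfile.write(data.decode("utf-8"))
    logfile.flush()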
I have a program in place that follows logic like this:
At the start of every hour, multiple directories receive a file that is continuously fed data. I'm developing a simple program that can read all the files simultaneously, and I've abstracted the tailing/reading part into a function; let's call it tail for now. The external program feeding the data doesn't always run smoothly. Sometimes a file will come in late, and sometimes the next hour will hit while the program is still feeding the stale file data. I can't afford to lose the data. My solution, using multiprocessing.Pool and pseudocode in parts, looks something like this:
import glob
import multiprocessing
import os
import sys
import time
from datetime import datetime, timedelta

def process_data(logfile):
    num_retries = 5
    while num_retries > 0:
        if os.path.isfile(logfile):
            for record in tail(logfile):
                do_something(record)
            break  # file has been read to completion; stop retrying
        else:
            num_retries -= 1
            time.sleep(30)

def tail(logfile):
    logfile = open(logfile, 'r')
    logfile.seek(0, 2)
    wait_time = 0  # initialize before the loop so the idle check below can't fail
    while True:
        line = logfile.readline()
        if line:
            wait_time = 0
            yield line
        else:
            if wait_time >= 360:
                break
            wait_time += 1
            time.sleep(1)
            continue

if __name__ == '__main__':
    start_time = sys.argv[1]
    next_hour = None
    while True:
        logdirs = glob.glob("/opt/logs/plog*")
        current_date = datetime.now()
        current_hour = current_date.strftime('%H')
        current_format = datetime.now().strftime("%Y%m%d%H")
        logfiles = [logdir + '/some/custom/path/tofile.log' for logdir in logdirs]
        if not next_hour:
            next_hour = current_date + timedelta(hours=1)
        if current_hour == next_hour.strftime('%H') or current_hour == start_time:
            start_time = None
            pool = multiprocessing.Pool()
            pool.map(process_data, logfiles)
            pool.close()
            pool.join()
            next_hour = current_date + timedelta(hours=1)
        time.sleep(30)
Here's what I'm observing when I have logging implemented at the process level:
all files in each directory are getting read appropriately
when the next hour hits, there's a delay of 360s (6 minutes) before the next set of files get read
so if hour 4 ends, a new pool doesn't get created for hour 5 until processes for hour 4 finish
What I'm looking for: I'd like to keep using multiprocessing, but I can't figure out why the code inside the main while loop doesn't continue until the previous Pool of processes finishes. I have tried the hourly logic in other examples without multiprocessing and had it work fine. I'm led to believe this has to do with the Pool class, and I'm hoping for advice on how to make it so that even while the previous Pool is active, I can create a new Pool for the new hour and begin processing new files, even if that creates a ton of processes.
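For what it's worth, the blocking is pool.map() itself: it does not return until every worker has finished, and pool.join() then blocks again, so the scheduling loop cannot reach the next hour until the previous hour's tails time out. A minimal sketch of a non-blocking variant using map_async, assuming it's acceptable to let each hour's pool finish in the background (the async_pools bookkeeping list is hypothetical):

# Sketch: launch each hour's work without blocking the scheduling loop.
async_pools = []  # hypothetical list of (pool, AsyncResult) pairs still in flight

if current_hour == next_hour.strftime('%H') or current_hour == start_time:
    start_time = None
    pool = multiprocessing.Pool()
    result = pool.map_async(process_data, logfiles)  # returns an AsyncResult immediately
    pool.close()  # no more tasks for this pool; its workers keep running
    async_pools.append((pool, result))
    next_hour = current_date + timedelta(hours=1)

# Periodically reap pools whose work has completed.
still_running = []
for p, r in async_pools:
    if r.ready():
        p.join()  # all of this pool's workers are done; clean up
    else:
        still_running.append((p, r))
async_pools = still_running

Because map_async hands back an AsyncResult right away, the outer while loop keeps cycling and can start a fresh Pool for the new hour even while the previous hour's processes are still tailing their files.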