I want to do something like this:
try:
    pid = int(file(lock_file, "r").read())
    print "%s exists with pid: %s" % (lock_file, pid)
    if not check_pid(pid):
        print "%s not running. Phantom lock file? Continuing anyways" % pid
    elif wall_time(pid) > 60 * 5:
        print "%s has been running for more than 5 minutes. Killing it" % pid
        os.kill(pid, signal.SIGTERM)  # os.kill() requires a signal argument
    else:
        print "Exiting"
        sys.exit()
except IOError:
    pass
lock = file(lock_file, "w")
lock.write("%s" % os.getpid())
lock.close()
How do I implement wall_time? Do I have to read from /proc or is there a better way?
Perhaps you could look at the creation time of the lock file (in practice its modification time, since the file is written only once). This isn't guaranteed to be correct, but it will be in most cases (and the consequences of getting it wrong are minimal).
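A minimal sketch of that idea, written for Python 3 and assuming the lock file is only ever written when the process starts. The helper name lock_age_seconds is made up for illustration:

```python
import os
import time

def lock_age_seconds(lock_file):
    """Seconds elapsed since lock_file was last modified."""
    # getmtime() returns the modification time as seconds since the epoch.
    return time.time() - os.path.getmtime(lock_file)
```

The check in the question would then compare lock_age_seconds(lock_file) against 60 * 5 instead of calling wall_time(pid).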
If you do not want to rely on the lock file's modification time for some reason, you could just write the start time into the file:
pid, start_time = map(int, file(lock_file, "r").read().split())
...
lock.write("%s %d" % (os.getpid(), time.time()))
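Put together, the read and write halves might look like the sketch below (Python 3). The function names write_lock and read_lock are hypothetical, and wall_time here takes the lock file path rather than a pid, which differs from the signature used in the question:

```python
import os
import time

def write_lock(lock_file):
    # Store "<pid> <start time>" so a later run can see how old the lock is.
    with open(lock_file, "w") as lock:
        lock.write("%d %d" % (os.getpid(), time.time()))

def read_lock(lock_file):
    # Returns (pid, start_time) as integers.
    with open(lock_file) as lock:
        pid, start_time = map(int, lock.read().split())
    return pid, start_time

def wall_time(lock_file):
    """Seconds since the process recorded in lock_file wrote its start time."""
    _, start_time = read_lock(lock_file)
    return time.time() - start_time
```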
I wrote (copied and adapted) a Python program to do some backups. It basically reads the files from a source location and copies them or creates hard links in a destination. It is multi-threaded, and that part works OK. The problem is that the loop called in the threads never ends. The relevant code is below. I read that one of the threads might be raising an error and hanging, but I don't know how to check that. The program reads from a file queue, processes the files, and is then supposed to end. It reads and processes the files until the queue is empty and then just hangs. Any suggestions on what to try or look at? Hopefully I pasted enough code to help.
print ("beginning backup")
backup(path_to_source, path_to_full, dest_path, backup_type)
print ("Backup finished")
# --- clean full backup
if backup_type == "Full":
    print ("Cleaning up full backup - deleting deleted files")
    clean_backup()
    print ("Backup has been cleaned. Done")
#--------------------------------------------------
def backup(path_to_source, path_to_full, dest_path, backup_type):
    threadWorkerCopyInc(source_files)
    print ("All Done") #(never executes)
def threadWorkerCopyInc(fileNameList):
    global num_threads
    #fileQueue = queue.queue()
    for i in range(num_threads):
        t = threading.Thread(target=IncCopyWorker)
        t.daemon = True
        t.start()
        #threads.append(t)
    for fileName in fileNameList:
        fileQueue.put(fileName)
    fileQueue.join()
    #for i in range(len(threads)):
    #    fileQueue.put('None')
    #for t in threads:
    #    t.join()
    print('done with threadworkercopyinc') #(never executes)
def IncCopyWorker():
    print ('Starting IncCopyWorker. ') #executes
    while True:
        filename = fileQueue.get()
        if filename == 'None':
            print("\nFilename is none") #executes when fileQueue.qsize() = 0
        with threadlock: processedfiles +=1
        print ("Files to go: %d processed files: %d newfiles: %d hardlinks: %d not copied: %d dfsr: %d " %(fileQueue.qsize(), processedfiles, newfiles, hardlinks, notcopied, dfsr), end = '\r')
        is_dfsr = filename.upper().find("DFSR")
        if (is_dfsr == -1):
            #do main processing here
        else:
            with threadlock: dfsr+=1
        fileQueue.task_done()
    print ("done with while true loop") #never Executes
So it looks like my problem was that I had included continue statements in my try/except blocks to try to speed up execution. With a continue or break statement in the loop, it never ended. When I took out the continue statements, it finished the way it was supposed to. I don't know the reason, just the solution.
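A likely explanation: fileQueue.join() only returns once task_done() has been called for every item that was put on the queue. A continue (or break) inside the loop jumps past the fileQueue.task_done() call at the bottom, so those items are never marked done and join() waits forever. One way to keep the early exits is to put task_done() in a finally block. The sketch below is Python 3 with made-up names (file_queue, processed, run), not the original program:

```python
import queue
import threading

file_queue = queue.Queue()
processed = []
lock = threading.Lock()

def worker():
    while True:
        filename = file_queue.get()
        try:
            if "DFSR" in filename.upper():
                continue  # safe: the finally clause below still runs
            with lock:
                processed.append(filename)
        finally:
            # Always mark the item done, whatever path the loop body took.
            file_queue.task_done()

def run(filenames, num_threads=4):
    for _ in range(num_threads):
        threading.Thread(target=worker, daemon=True).start()
    for name in filenames:
        file_queue.put(name)
    file_queue.join()  # returns once every item has been marked done
```

Because the workers are daemon threads blocked on get(), they simply go away when the main program exits.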
if __name__=='__main__':
    print("================================================= \n")
    print 'The test will be running for: 18 hours ...'
    get_current_time = datetime.now()
    test_ended_time = get_current_time + timedelta(hours=18)
    print 'Current time is:', get_current_time.time(), 'Your test will be ended at:', test_ended_time.time()
    autodb = autodb_connect()
    db = bw_dj_connect()
    started_date, full_path, ips = main()
    pid = os.getpid()
    print('Main Process is started and PID is: ' + str(pid))
    start_time = time.time()
    process_list = []
    for ip in ips:
        p = Process(target=worker, args=(ip, started_date, full_path))
        p.start()
        p.join()
        child_pid = str(p.pid)
        print('PID is:' + child_pid)
        process_list.append(child_pid)
    child = multiprocessing.active_children()
    print process_list
    while child != []:
        time.sleep(1)
        child = multiprocessing.active_children()
    print ' All processes are completed successfully ...'
    print '_____________________________________'
    print(' All processes took {} second!'.format(time.time()-start_time))
I have a Python test script which should run for 18 hours and then kill itself. The script uses multiprocessing to handle multiple devices. The data I get from the main() function changes over time.
I am passing these three args to the worker method in multiprocessing.
How can I achieve that?
If you don't need to worry too much about cleanup in the child processes, you can kill them using .terminate():
...
time.sleep(18 * 60 * 60) # go to sleep for 18 hours
children = multiprocessing.active_children()
for child in children:
    child.terminate()
for child in multiprocessing.active_children():
    child.join() # wait for the children to terminate
If you do need to do some cleanup in all the child processes, then you need to modify their run loop (I'm assuming while True) to monitor the time passing, and keep only the second loop above in the main program, waiting for the children to go away on their own.
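As a rough sketch of such a run loop (Python 3): the child receives a deadline, keeps working until it passes, and then cleans up by itself. The names run_until, do_step and cleanup are hypothetical stand-ins for whatever the real worker does:

```python
import time

def run_until(deadline, do_step, cleanup):
    """Call do_step() repeatedly until time.time() passes deadline, then cleanup()."""
    steps = 0
    try:
        while time.time() < deadline:
            do_step()
            steps += 1
    finally:
        # Runs on normal expiry and also if do_step() raises.
        cleanup()
    return steps
```

The parent would compute deadline = time.time() + 18 * 60 * 60 once and pass it to each worker in its args, so all children stop at roughly the same moment.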
You are never comparing datetime.now() to test_ended_time.
# Keep looping until the 18-hour checkpoint has passed or all children have finished.
while datetime.now() < test_ended_time and multiprocessing.active_children():
    print('still running my process.')
    time.sleep(1)  # avoid a busy loop
sys.exit(0)
I have a script that checks a Gmail account using the IMAP IDLE protocol. To do this I use imaplib2, hosted here. Every so often it throws an unhandled exception:
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\imaplib2\imaplib2.py", line 1830, in _reader
    raise IOError("Too many read 0")
IOError: Too many read 0
(line 1839 from the posted link)
Here is the offending section (halfway down):
def _reader(self):
    threading.currentThread().setName(self.identifier + 'reader')
    if __debug__: self._log(1, 'starting using select')
    line_part = ''
    rxzero = 0
    terminate = False
    while not (terminate or self.Terminate):
        if self.state == LOGOUT:
            timeout = 1
        else:
            timeout = self.read_poll_timeout
        try:
            r,w,e = select.select([self.read_fd], [], [], timeout)
            if __debug__: self._log(5, 'select => %s, %s, %s' % (r,w,e))
            if not r: # Timeout
                continue
            data = self.read(self.read_size) # Drain ssl buffer if present
            start = 0
            dlen = len(data)
            if __debug__: self._log(5, 'rcvd %s' % dlen)
            if dlen == 0:
                rxzero += 1
                if rxzero > 5:
                    raise IOError("Too many read 0") # <- This is the error I'm
                time.sleep(0.1)                      #    getting
            else:
                rxzero = 0
            while True:
                stop = data.find('\n', start)
                if stop < 0:
                    line_part += data[start:]
                    break
                stop += 1
                line_part, start, line = \
                    '', stop, line_part + data[start:stop]
                if __debug__: self._log(4, '< %s' % line)
                self.inq.put(line)
                if self.TerminateReader:
                    terminate = True
        except:
            reason = 'socket error: %s - %s' % sys.exc_info()[:2]
            if __debug__:
                if not self.Terminate:
                    self._print_log()
                    if self.debug: self.debug += 4 # Output all
                    self._log(1, reason)
            self.inq.put((self.abort, reason))
            break
I can't catch this error from my script because imaplib2 creates separate threads for its _reader and _writer functions. I don't really understand the error, so my question is should I modify the imaplib2 source code to ignore this error or change the conditions of it or what?
Thanks
I've been getting a variety of errors from imaplib2, including errno 10054 (connection forcibly closed) and "Too many read 0". These errors would cause my program to hang for about half an hour. To work around these issues, I used multiprocessing to do the work in a separate process and implemented an activity check. If there is no activity for a period of time, the main process terminates the child process (I know, not ideal) and spawns another. Here is some of the relevant code.
def MasterRun():
    from multiprocessing import Value
    # "i" is a signed int, so the child can store a negative value to signal a natural exit
    counter = Value("i", 0)
    last = counter.value
    elapsed = 0
    interval = 1
    TIMEOUT = 90
    proc = _make_process(counter)
    # value < 0 signals process quit naturally
    while counter.value >= 0:
        if counter.value != last:
            elapsed = 0
            last = counter.value
        if elapsed >= TIMEOUT or not proc.is_alive():
            print "terminating process reason: %s\n" % \
                ("no activity time was exceeded" if proc.is_alive() else "process committed suicide")
            proc.terminate()
            proc.join(25)
            # _make_process() already starts the child; starting it twice raises an error
            proc = _make_process(counter)
            elapsed = 0
        sleep(interval)
        elapsed += interval
    proc.join(25)
def _make_process(counter):
    from multiprocessing import Process
    print "spawning child process"
    proc = Process(target=_make_instance, args=(counter, ))
    proc.daemon = True
    proc.start()
    return proc
This should be simple, but I'm just not seeing it.
If I have a process ID, how can I use that to grab info about the process, such as the process name?
Under Linux, you can read the proc filesystem. The file /proc/<pid>/cmdline contains the command line.
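For instance, a small Python 3 sketch (the function name cmdline_for_pid is made up). It assumes a Linux-style /proc and returns None when the file cannot be read, for example on other platforms or when the process has already exited:

```python
import os

def cmdline_for_pid(pid):
    """Return the command line of pid as a list of arguments, or None."""
    path = os.path.join("/proc", str(pid), "cmdline")
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None
    # Arguments are separated (and terminated) by NUL bytes.
    # Empty pieces, including the trailing one, are dropped here.
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]
```

The first element of the list is normally the program name or path.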
Try psutil -> https://github.com/giampaolo/psutil
Works fine on Windows and Unix, I recall.
For Windows
A way to get all the PIDs of programs on your computer without downloading any modules:
import os
pids = []
a = os.popen("tasklist").readlines()
for x in a:
    try:
        pids.append(int(x[29:34]))
    except:
        pass
for each in pids:
    print(each)
If you just wanted one program or all programs with the same name and you wanted to kill the process or something:
import os, sys, win32api
tasklistrl = os.popen("tasklist").readlines()
tasklistr = os.popen("tasklist").read()
print(tasklistr)
def kill(process):
    process_exists_forsure = False
    gotpid = False
    for examine in tasklistrl:
        if process == examine[0:len(process)]:
            process_exists_forsure = True
    if process_exists_forsure:
        print("That process exists.")
    else:
        print("That process does not exist.")
        raw_input()
        sys.exit()
    for getpid in tasklistrl:
        if process == getpid[0:len(process)]:
            pid = int(getpid[29:34])
            gotpid = True
            try:
                handle = win32api.OpenProcess(1, False, pid)
                win32api.TerminateProcess(handle, 0)
                win32api.CloseHandle(handle)
                # use the function argument here, not the global prompt
                print("Successfully killed process %s on pid %d." % (getpid[0:len(process)], pid))
            except win32api.error as err:
                print(err)
                raw_input()
                sys.exit()
    if not gotpid:
        print("Could not get process pid.")
        raw_input()
        sys.exit()
    raw_input()
    sys.exit()
prompt = raw_input("Which process would you like to kill? ")
kill(prompt)
That was just a paste of my process-kill program. I could make it a whole lot better, but it is okay.
Using psutil, here is the simplest code I can give you:
import psutil
# The PID of the process of interest
pid_id = 1216
# Information about the process with that PID
process_pid = psutil.Process(pid_id)
print(process_pid)
# Gives you the PID, name and start time
# psutil.Process(pid=1216, name='ATKOSD2.exe', started='21:38:05')
# Name of the process
process_name = process_pid.name()
Try this:
import os
def filter_non_printable(text):
    ret = ""
    for c in text:
        if ord(c) > 31 or ord(c) == 9:
            ret += c
        else:
            ret += " "
    return ret
#
# Get /proc/<pid>/cmdline information
#
def pid_name(pid):
    try:
        # str() lets callers pass the pid as an int or a string
        with open(os.path.join('/proc/', str(pid), 'cmdline'), 'r') as pidfile:
            return filter_non_printable(pidfile.readline())
    except Exception:
        pass
    return
I wrote this simple Munin plugin to graph average fan speed and I want to redo it to OOP - strictly as a learning exercise. Don't have a clue where to start though. Anyone feel like offering some guidance or even an example of what this script should look like when done. I will use it to redo some other scripts into an OOP style as well; again for learning purposes.
import sys
import subprocess
CMD = "/usr/sbin/omreport chassis fans".split()
# Munin populates sys.argv[1] with "" (an empty argument), lets remove it.
sys.argv = [x for x in sys.argv if x]
if len(sys.argv) > 1:
    if sys.argv[1].lower() == "autoconfig":
        print "autoconfig"
    elif sys.argv[1].lower() == "config":
        print "graph_title Average Fan Speed"
        print "graph_args --base 1000 -l 0"
        print "graph_vlabel speed (RPM)"
        print "graph_category Chassis"
        print "graph_info This graph shows the average speed of all fans"
        print "graph_period second"
        print "speed.label speed"
        print "speed.info Average fan speed for the five minutes."
else:
    try:
        data = subprocess.Popen(CMD,stdout=subprocess.PIPE).stdout.readlines()
    except OSError, e:
        print >> sys.stderr, "Error running '%s', %s" % (" ".join(CMD), e)
        sys.exit(1)
    count = total = 0
    for item in data:
        if "Reading" in item:
            # Extract variable length fan speed, without regex.
            total += int(item.split(":")[1].split()[0])
            count += 1
    # Sometimes omreport returns zero output if omsa services aren't started.
    if not count or not total:
        print >> sys.stderr, 'Error: "omreport chassis fans" returned 0 output.'
        print >> sys.stderr, 'OMSA running? Try: "srvadmin-services.sh status".'
        sys.exit(1)
    avg = (total / count)
    print "speed.value %s" % avg
You remake it in OOP by identifying code and data that go together. These you then merge into "classes".
Your actual data above seems to be the output of a process.
The code is iterating over it. I guess you can make a class out of that if you want to, but it's a bit silly. :)
So, something like this (obviously completely untested code):
import sys
import subprocess
class Fanspeed(object):
    def __init__(self, command):
        self.command = command.split()
    def average_fan_speed(self):
        data = subprocess.Popen(self.command, stdout=subprocess.PIPE).stdout.readlines()
        count = total = 0
        for item in data:
            if "Reading" in item:
                # Extract variable length fan speed, without regex.
                total += int(item.split(":")[1].split()[0])
                count += 1
        # Sometimes omreport returns zero output if omsa services aren't started.
        if not count or not total:
            raise ValueError("I found no fans. Is OMSA services started?")
        avg = (total / count)
        return avg
if __name__ == '__main__':
    # Munin populates sys.argv[1] with "" (an empty argument), lets remove it.
    sys.argv = [x for x in sys.argv if x]
    if len(sys.argv) > 1:
        if sys.argv[1].lower() == "autoconfig":
            print "autoconfig"
        elif sys.argv[1].lower() == "config":
            print "graph_title Average Fan Speed"
            print "graph_args --base 1000 -l 0"
            print "graph_vlabel speed (RPM)"
            print "graph_category Chassis"
            print "graph_info This graph shows the average speed of all fans"
            print "graph_period second"
            print "speed.label speed"
            print "speed.info Average fan speed for the five minutes."
    else:
        try:
            cmd = "/usr/sbin/omreport chassis fans"
            fanspeed = Fanspeed(cmd)
            average = fanspeed.average_fan_speed()
            print "speed.value %s" % average
        except OSError, e:
            print >> sys.stderr, "Error running '%s', %s" % (cmd, e)
            sys.exit(1)
        except ValueError, e:
            # Sometimes omreport returns zero output if omsa services aren't started.
            print >> sys.stderr, 'Error: "omreport chassis fans" returned 0 output.'
            print >> sys.stderr, 'OMSA running? Try: "srvadmin-services.sh status".'
            sys.exit(1)
But YMMV. It's perhaps a bit clearer.
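If you want to push the "code and data that go together" idea one step further, one possible variation is to have the class hold the output lines themselves, so the parsing can be exercised without running omreport. This is only a sketch in Python 3; the class name FanReport is made up, and the sample lines in the usage note are invented rather than real omreport output:

```python
class FanReport(object):
    """Holds omreport-style output lines and computes the average reading."""

    def __init__(self, lines):
        self.lines = list(lines)

    def readings(self):
        # Same extraction as in the code above: the number after the first colon.
        return [int(line.split(":")[1].split()[0])
                for line in self.lines if "Reading" in line]

    def average(self):
        values = self.readings()
        if not values or not sum(values):
            raise ValueError("no fan readings found")
        return sum(values) // len(values)  # integer average, as in Python 2
```

With the made-up input ["Reading : 3600 RPM", "Reading : 2400 RPM"], FanReport(...).average() gives 3000. The subprocess call then stays in the main block and just hands its lines to the class.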