Python performance profiling (file close)

First of all, thanks for your attention. My question is how to reduce the execution time of my code.
Here is the relevant code. The code below is called in a loop from main.
def call_prism(prism_input_file,random_length):
    prism_output_file = "path.txt"
    cmd = "prism %s -simpath %d %s" % (prism_input_file,random_length,prism_output_file)
    p = os.popen(cmd)
    p.close()
    return prism_output_file
def main(prism_input_file, number_of_strings):
    ...
    for n in range(number_of_strings):
        prism_output_file = call_prism(prism_input_file,z[n])
        ...
    return
I profiled my code and looked at the results in the "profile statistics browser". The "file close" system call took the most time (14.546 seconds). call_prism was called 10 times, but number_of_strings is usually in the thousands, so my program takes a long time to complete.
Let me know if you need more information. By the way, I tried subprocess too. Thanks.
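A note on why the profiler blames "file close": the object returned by os.popen waits for the child process when it is closed, so the time charged to close() is largely the run time of the external command itself. A minimal sketch of that effect, assuming a stand-in child process (a Python one-liner that sleeps) in place of the real prism command:

```python
import os
import sys
import time

# Stand-in for the prism command (assumption): a child process that sleeps 0.3 s.
cmd = f'"{sys.executable}" -c "import time; time.sleep(0.3)"'

t0 = time.perf_counter()
p = os.popen(cmd)          # starts the child and returns right away
t1 = time.perf_counter()
p.close()                  # blocks until the child process has exited
t2 = time.perf_counter()

open_seconds = t1 - t0
close_seconds = t2 - t1
print(f"popen: {open_seconds:.3f} s, close: {close_seconds:.3f} s")
```

If that is what is happening here, tuning the Python side will not help much; the remaining options are making each prism run cheaper or running several of them at once.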

Thanks for your feedback on my question. Based on the comments that others provided I did a parallel version of my code, and the performance of the code indeed improved. Here is the snippet of the parallel version. Your feedback, if any, is welcome.
def call_prism(prism_input_file,random_length):
    ...
    cmd = "prism %s -simpath %d stdout" % (prism_input_file,random_length)
    args = shlex.split(cmd)
    p = subprocess.Popen(args,stdout=subprocess.PIPE)
    p.poll()
    prism_output_lines = p.stdout.readlines()
    ...
    return ...
def call_prism_star(prism_input_file_random_length):
    return call_prism(*prism_input_file_random_length)
def main(prism_input_file, number_of_strings,number_of_threads):
    pool = Pool(processes=number_of_threads)
    for n in range(0,number_of_strings,number_of_threads):
        ...
        for i in range(number_of_threads):
            a_args.append(...)
        output = pool.map(call_prism_star,itertools.izip(itertools.repeat(prism_input_file),a_args))
        ...
    return
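For readers on Python 3, where itertools.izip no longer exists, the same idea can be sketched with a thread pool, since each worker only waits on an external process. This is an illustration under assumptions, not the asker's code: run_simulation is a hypothetical name, and a Python one-liner stands in for the prism command.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Stand-in child program (assumption): prints one line per simulated step.
CHILD = "import sys; print('\\n'.join(['step'] * int(sys.argv[1])))"

def run_simulation(length):
    """Run one external simulation and return its output lines."""
    # With the real tool this would be something like
    # ["prism", prism_input_file, "-simpath", str(length), "stdout"].
    cmd = [sys.executable, "-c", CHILD, str(length)]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

lengths = [3, 5, 2, 4]
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() hands each length to exactly one worker and keeps the input order.
    outputs = list(pool.map(run_simulation, lengths))

print([len(lines) for lines in outputs])
```

subprocess.run waits for the child and collects its output in one step, so no separate poll() or close() call is needed.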

Related

Stop multiprocess pool when a condition is met and continue with program

I've been trying to wrap my head around multiprocessing using an old python bitcoin mining program. Although relatively useless for mining, I figured this would be a great way to explore multiprocessing. However, I've hit a wall when it comes to stopping the processes when one of them achieves the goal they are all working towards.
I want to kill all multiprocessing pools when one of them finds the solution. Then allow the program to continue. I have tried terminate() and join(). I've attempted to include an Event(). I've tried using Process instead of Pool with the direction of a similar issue here: Killing a multiprocessing process when condition is met. However, same problem. How can I stop all processes after a condition is met without exiting the program with something like sys.exit() that would kill the entire program?
I also tried apply_async, following this post: Python Multiprocess Pool. How to exit the script when one of the worker process determines no more work needs to be done? However, it did not solve the problem of needing to continue executing the final functions of the program. In fact, it slowed the program significantly.
For clarity, I've included the code I tried based on the above mentioned link here:
from multiprocessing import Pool
from hashlib import sha256
import time
def SHA256(text):
    return sha256(text.encode("ascii")).hexdigest()
def solution_helper(args):
    solution, nonce = do_job(args)
    if solution:
        print(f"\nNonce Found: {nonce}\n")
        return True
    else:
        return False
class Mining():
    def __init__(self, workers, initargs):
        self.pool = Pool(processes=workers, initargs=initargs)
    def callback(self, result):
        if result:
            print('Solution Found...Terminating Processes...')
            self.pool.terminate()
    def do_job(self):
        for args in values:
            start_nonce = args[0]
            end_nonce = args[1]
            prefix_str = '0'*difficulty
            self.pool.apply_async(solution_helper, args=args, callback=self.callback)
            start = time.time()
            for nonce in range(start_nonce, end_nonce):
                text = str(block_number) + transactions + previous_hash + str(nonce)
                new_hash = SHA256(text)
                if new_hash.startswith(prefix_str):
                    print(f"Hashing: {text}")
                    print(f"\nSuccessfully mined bitcoin with nonce value: {nonce}\n")
                    print(f"New hash: {new_hash}")
                    total_time = str((time.time()-start))
                    print(f"\nEnd mining... Mining took {total_time} seconds\n")
                    return new_hash, nonce
        self.pool.close()
        self.pool.join()
        print('.Goodbye.')
block_number = 5
transactions = """
bill->steve->20,
jan->phillis->45
"""
previous_hash = '0000000b7c7723e4d3a8654c975fe4dd23d4d37f22d0ea7e5abde2225d1567dc6'
values = [(20000, 100000), (100000, 1000000), (1000000, 10000000), (10000000, 100000000)]
difficulty = 4
m = Mining(5, values)
m.do_job()
Here's the basic concept. It works great to start the processes, but I cannot figure out how to stop them:
from multiprocessing import Pool
from hashlib import sha256
import functools
MAX_NONCE = 1000000000
def SHA256(text):
    return sha256(text.encode("ascii")).hexdigest()
def nonce(block_number, transactions, previous_hash, prefix_str):
    import time
    start = time.time()
    for nonce in range(MAX_NONCE):
        text = str(block_number) + transactions + previous_hash + str(nonce)
        new_hash = SHA256(text)
        if new_hash.startswith(prefix_str):
            print(f"\nYay! Successfully mined bitcoins with nonce value:{nonce}")
            total_time = str((time.time()-start))
            print(f"\nend mining. Mining took: {total_time} seconds\n")
            print(new_hash + "\n")
def mine(block_number, transactions, previous_hash, prefix_zeros):
    from multiprocessing import Pool
    with Pool(4) as p:
        prefix_str = '0'*prefix_zeros
        p.map(nonce(block_number, transactions, previous_hash, prefix_str), [20000, 40000, 60000, 80000, 100000])
if __name__=='__main__':
    transactions="""
bill->steve->20,
jan->phillis->45
"""
    difficulty=7
    print("\nstart mining\n")
    new_hash = mine(5, transactions, '0000000b7c7723e4d3a8654c975fe4dd23d4d37f22d0ea7e5abde2225d1567dc6', difficulty)
    # Do some other things... Here is where I'd like to get to after the multiprocesses are killed
    print(f"\nMission Complete...{new_hash}\n")  # <--- This never gets a chance to happen
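One way to get the behaviour asked for here (stop all workers as soon as one finds a solution, then carry on with the program) is to consume results with imap_unordered and leave the with Pool(...) block on the first hit; exiting the block terminates the pool. A minimal sketch, assuming made-up block data (HEADER), a low difficulty (PREFIX), and hypothetical helper names search_range and mine:

```python
from hashlib import sha256
from multiprocessing import Pool

PREFIX = "000"                       # assumed difficulty: three leading hex zeros
HEADER = "5|bill->steve->20|prev"    # hypothetical stand-in for the block data

def search_range(bounds):
    """Return the first nonce in [start, end) whose hash starts with PREFIX, else None."""
    start, end = bounds
    for nonce in range(start, end):
        digest = sha256(f"{HEADER}{nonce}".encode("ascii")).hexdigest()
        if digest.startswith(PREFIX):
            return nonce
    return None

def mine(chunks, workers=4):
    """Search the chunks in parallel and return as soon as any worker succeeds."""
    with Pool(workers) as pool:
        # Results arrive in completion order, not submission order.
        for found in pool.imap_unordered(search_range, chunks):
            if found is not None:
                # Returning leaves the with-block, which calls pool.terminate()
                # and stops the workers that are still searching.
                return found
    return None
```

Calling mine([(0, 50_000), (50_000, 100_000)]) from under an if __name__ == '__main__': guard returns the nonce to the caller, and whatever follows the call runs normally. Splitting the nonce space into many small chunks keeps the pool from committing to huge ranges that are no longer needed.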

Python data generation script slows down with time

EDIT 1:
As fizzybear pointed out, it looks as though my memory usage is steadily increasing, but I can't say why. Any ideas would be greatly appreciated.
I'm running a script which uses the staticfg library to generate a large number of control flow graphs (CFGs) from Python programs, approximately 150,000 programs. My code simply loops through every program's file location and generates a corresponding control flow graph.
From a frequently updated progress bar I can see that when the script starts it easily generates around 1000 CFGs in a few seconds, but half an hour in it can barely generate 100 CFGs in a minute.
In an attempt to speed things up I implemented multithreading using the map() function of Python's multiprocessing.dummy pool, but this doesn't help enough.
Furthermore, the CPU utilization (for all cores) shoots up to around 80-90% at the beginning of the script but drops to around 30-40% after running for a few minutes.
I've tried running it on Windows 10 and Ubuntu 18.04, and on both it slows down to an almost unbearable speed.
Code for building control-flow-graph
from staticfg import CFGBuilder
def process_set():
    content = get_file_paths()
    iterate(build_cfg, ERROR_LOG_FILE, content)
def build_cfg(file_path):
    cfg = CFGBuilder().build_from_file(os.path.basename(file_path), os.path.join(DATA_PATH, file_path))
    cfg.build_visual(get_output_data_path(file_path), format='dot', calls=False, show=False)
    os.remove(get_output_data_path(file_path)) # Delete the other weird file created
Code for running the cfg building
from threading import Lock
from multiprocessing.dummy import Pool as ThreadPool
import multiprocessing
def iterate(task, error_file_path, content):
    progress_bar = ProgressBar(0, len(content), prefix='Progress:', suffix='Complete')
    progress_bar.print_progress_bar()
    error_file_lock = Lock()
    increment_work_lock = Lock()
    increment_errors_lock = Lock()
    def an_iteration(file):
        try:
            task(file)
        except Exception as e:
            with increment_errors_lock:
                progress_bar.increment_errors()
            with error_file_lock:
                handle_exception(error_file_path, file, 'Error in doing thing', e)
        finally:
            with increment_work_lock:
                progress_bar.increment_work()
                progress_bar.print_progress_bar()
    pool = multiprocessing.dummy.Pool(multiprocessing.cpu_count())
    pool.map(an_iteration, content)
Code for error handling
def handle_exception(error_log_file_path, file_path, message, stacktrace):
    with open(error_log_file_path, 'a+', encoding='utf8') as f:
        f.write('\r{},{},{},{}\n'.format(str(datetime.datetime.now()), message, file_path, stacktrace))
As far as I can tell (?) there is no object ever increasing in size and no increasing lookup time somewhere, so I'm a little lost as to why the script should be slowing down at all. Any help would be greatly appreciated.
I'm also pretty sure that it's not the contention for the locks that is slowing down the program as I was having this problem before I implemented multi threading, and contention should be pretty low anyway because the CFG building should take up a lot more time than updating the progress bar. Furthermore, errors aren't that frequent so writing to the error log doesn't happen too often, not enough to justify a lot of contention.
Cheers.
Edit 2:
Code for progress bar in case that affects the memory usage
class ProgressBar:
    def __init__(self, iteration, total, prefix='', suffix='', decimals=1, length=100, fill='█'):
        self.iteration = iteration
        self.total = total
        self.prefix = prefix
        self.suffix = suffix
        self.decimals = decimals
        self.length = length
        self.fill = fill
        self.errors = 0
    def increment_work(self):
        self.iteration += 1
    def increment_errors(self):
        self.errors += 1
    def print_progress_bar(self):
        percent = ("{0:." + str(self.decimals) + "f}").format(100 * (self.iteration / float(self.total)))
        filled_length = int(self.length * self.iteration // self.total)
        bar = self.fill * filled_length + '-' * (self.length - filled_length)
        print('%s |%s| %s%% (%s/%s) %s, %s %s' % (self.prefix, bar, percent, self.iteration, self.total, self.suffix, str(self.errors), 'errors'), end='\r')
        # Print New Line on Complete
        if self.iteration == self.total:
            print()
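Since the symptom is steadily growing memory, one way to find out what is growing is the standard library's tracemalloc module: take a snapshot, run a batch of work, take another, and compare. A minimal sketch, assuming a deliberately leaky stand-in task (leaky_task and _cache are hypothetical; in the real script the work would be a batch of build_cfg calls):

```python
import tracemalloc

def top_growth(work, rounds=3, limit=5):
    """Run work() a few times and return the source lines whose allocations grew most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(rounds):
        work()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to() sorts the differences with the largest size change first.
    return after.compare_to(before, "lineno")[:limit]

_cache = []  # hypothetical leak: a module-level list that is never cleared

def leaky_task():
    _cache.append(bytearray(100_000))

for stat in top_growth(leaky_task):
    print(stat)
```

Each printed entry names a file and line together with how many bytes it gained between the two snapshots, which narrows down whether the growth comes from the script itself or from inside a library.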

disorder in multiprocessing subprocess

After running some computations nicely in a linear fashion with a moderator script (see below) that calls an inner script performing the computation, I am struggling to get it to run with multiprocessing. It seems that each CPU core runs through the whole list (testRegister) and launches a computation even if another core has already performed that task earlier (in the same session). How can I prevent this chaotic behaviour? This is my first attempt at using multiple processors from Python.
Correction: The initial post did not show that test is a string that calls the inner script with the varying parameters m1 and m2, alongside the fixed arguments arg1 and arg2, which belong solely to the inner script.
#!/usr/bin/env python3
import os
import subprocess as sub
import sys
import multiprocessing
fileRegister = []
testRegister = []
def fileCollector():
    for file in os.listdir("."):
        if file.endswith(".xyz"):
            fileRegister.append(file)
    fileRegister.sort()
    return fileRegister
def testSetup():
    data = fileRegister
    while len(data) > 1:
        for entry in fileRegister[1:]:
            m1 = str(fileRegister[0])
            m2 = str(entry)
            test = str("python foo.py ") + str(m1) + str(" ") + str(m2) +\
                str(" --arg1 --arg2") # formulate test condition
            testRegister.append(test)
        testRegister.sort()
        del data[0]
    return testRegister
def shortAnalysator():
    for entry in testRegister:
        print(str(entry))
        sub.call(entry, shell=True)
        del testRegister[0]
def polyAnalysator():
    # apparently each CPU core works as if the register were not shared
    # reference: https://docs.python.org/3.7/library/multiprocessing.html
    if __name__ == '__main__':
        jobs = []
        for i in range(3): # safety margin to not consume all CPU
            p = multiprocessing.Process(target=shortAnalysator)
            jobs.append(p)
            p.start()
fileCollector()
testSetup()
shortAnalysator() # proceeding as expected on one CPU (slow)
# polyAnalysator() # causing irritation
sys.exit()

Your polyAnalysator is running the shortAnalysator three times. Try changing your polyAnalysator as follows, and add the f method. This uses the multiprocessing Pool:
from multiprocessing import Pool
def f(test):
    sub.call(test, shell=True)
def polyAnalysator():
    # apparently each CPU core works as if the register were not shared
    # reference: https://docs.python.org/3.7/library/multiprocessing.html
    with Pool(3) as p:
        p.map(f, testRegister)
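The pairing loop in testSetup produces every unordered pair of files, which itertools.combinations does directly, and because the heavy lifting happens in the external foo.py processes, a thread pool is enough to hand each command to exactly one worker. A minimal sketch under those assumptions (build_commands and run_all are hypothetical names, and each command is passed as an argument list instead of a shell string):

```python
import itertools
import subprocess
from multiprocessing.pool import ThreadPool

def build_commands(files):
    """Build one foo.py invocation per unordered pair of input files."""
    return [
        ["python", "foo.py", m1, m2, "--arg1", "--arg2"]
        for m1, m2 in itertools.combinations(sorted(files), 2)
    ]

def run_all(commands, workers=3):
    """Run every command once, at most `workers` at a time; return the exit codes."""
    with ThreadPool(workers) as pool:
        # map() splits the list among the workers, so no command is run twice.
        return pool.map(subprocess.call, commands)

commands = build_commands(["b.xyz", "a.xyz", "c.xyz"])
for cmd in commands:
    print(" ".join(cmd))
```

The difference from starting three Process objects on shortAnalysator is that the pool owns the list and distributes its items, whereas each separately started process gets its own copy of testRegister and works through all of it.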

How do I gather performance metrics for GDI and user Objects using python

I think this is the first question I have asked here; normally I find all the answers I need (so thanks in advance).
I have written a Python program that monitors a process in threads and writes the results to a CSV file for later analysis. This code works well: I use win32pdhutil for the counters and WMI's Win32_PerfRawData_PerfProc_Process for the CPU % time. I have now been asked to monitor a WPF application, and specifically its User objects and GDI objects.
This is where I have a problem: I can't find any Python support for gathering these two counters. They are easily available in Task Manager, so I find it odd that there is so little information about them. I want to gather them to see whether we have a memory leak, and I don't want to install anything on the system other than the Python that is already installed. Can anyone help me find a solution?
I am using Python 3.3.1, running on Windows (mainly Win7 and Win8).
This is the code I am using to gather the data:
def gatherIt(self,whoIt,whatIt,type,wiggle,process_info2):
    #this is the data gathering function thing
    data=0.0
    data1="wobble"
    if type=="counter":
        #gather data according to the attributes
        try:
            data = win32pdhutil.FindPerformanceAttributesByName(whoIt, counter=whatIt)
        except:
            #a problem occurred with the process not being there
            data1="N/A"
    elif type=="cpu":
        try:
            process_info={}#used in the gather CPU based on service
            for x in range(2):
                for procP in wiggle.Win32_PerfRawData_PerfProc_Process(name=whoIt):
                    n1 = int(procP.PercentProcessorTime)
                    d1 = int(procP.Timestamp_Sys100NS)
                    #need to get the process id to change per cpu look...
                    n0, d0 = process_info.get (whoIt, (0, 0))
                    try:
                        percent_processor_time = (float (n1 - n0) / float (d1 - d0)) *100.0
                        #print whoIt, percent_processor_time
                    except ZeroDivisionError:
                        percent_processor_time = 0.0
                    # pass back the n0 and d0
                    process_info[whoIt] = (n1, d1)
                #end for loop (this should take into account multiple cpu's)
            # end for range to allow for a current cpu time rather than cpu percent over sampleint
            if percent_processor_time==0.0:
                data=0.0
            else:
                data=percent_processor_time
        except:
            data1="N/A"
    else:
        #we have done something wrong so data =0
        data1="N/A"
    #endif
    if data == "[]":
        data=0.0
        data1="N/A"
    if data == "" :
        data=0.0
        data1="N/A"
    if data == " ":
        data=0.0
        data1="N/A"
    if data1!="wobble" and data==0.0:
        #we have not got the result we were expecting so add a n/a
        data=data1
    return data
cheers
edited for correct cpu timings issue if anyone tried to run it :D

After a long search I was able to mash something together that gets me the info needed.
import time
from ctypes import *
from ctypes.wintypes import *
import win32pdh
# with help from here http://coding.derkeiler.com/Archive/Python/comp.lang.python/2007-10/msg00717.html
# the following has been mashed together to get the info needed
def GetProcessID(name):
    object = "Process"
    items, instances = win32pdh.EnumObjectItems(None, None, object, win32pdh.PERF_DETAIL_WIZARD)
    val = None
    if name in instances :
        tenQuery = win32pdh.OpenQuery()
        tenarray = [ ]
        item = "ID Process"
        path = win32pdh.MakeCounterPath( ( None, object, name, None, 0, item ) )
        tenarray.append( win32pdh.AddCounter( tenQuery, path ) )
        win32pdh.CollectQueryData( tenQuery )
        time.sleep( 0.01 )
        win32pdh.CollectQueryData( tenQuery )
        for tencounter in tenarray:
            type, val = win32pdh.GetFormattedCounterValue( tencounter, win32pdh.PDH_FMT_LONG )
            win32pdh.RemoveCounter( tencounter )
        win32pdh.CloseQuery( tenQuery )
    return val
processIDs = GetProcessID('OUTLOOK') # Remember this is case sensitive
PQI = 0x400
#open a handle on to the process so that we can query it
OpenProcessHandle = windll.kernel32.OpenProcess(PQI, 0, processIDs)
# OK so now we have opened the process now we want to query it
GR_GDIOBJECTS, GR_USEROBJECTS = 0, 1
print(windll.user32.GetGuiResources(OpenProcessHandle, GR_GDIOBJECTS))
print(windll.user32.GetGuiResources(OpenProcessHandle, GR_USEROBJECTS))
#so we have what we want we now close the process handle
windll.kernel32.CloseHandle(OpenProcessHandle)
hope that helps

For GDI count, I think a simpler, cleaner monitoring script is as follows:
import time, psutil
from ctypes import *
def getPID(processName):
    for proc in psutil.process_iter():
        try:
            if processName.lower() in proc.name().lower():
                return proc.pid
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            pass
    return None
def getGDIcount(PID):
    PH = windll.kernel32.OpenProcess(0x400, 0, PID)
    GDIcount = windll.user32.GetGuiResources(PH, 0)
    windll.kernel32.CloseHandle(PH)
    return GDIcount
PID = getPID('Outlook')
while True:
    GDIcount = getGDIcount(PID)
    print(f"{time.ctime()}, {GDIcount}")
    time.sleep(1)
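Both counters can be read from the same process handle, one GetGuiResources call each. The variant below is a sketch, not a tested monitoring tool: it declares argument and return types for the Win32 functions so that the handle is not passed around as a default C int, and it raises OSError on anything other than Windows. gui_resource_counts is a hypothetical name.

```python
import os
import sys

PROCESS_QUERY_INFORMATION = 0x0400
GR_GDIOBJECTS, GR_USEROBJECTS = 0, 1

def gui_resource_counts(pid):
    """Return the GDI and USER object counts for a process (Windows only)."""
    if sys.platform != "win32":
        raise OSError("GetGuiResources is only available on Windows")
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    kernel32.OpenProcess.argtypes = (wintypes.DWORD, wintypes.BOOL, wintypes.DWORD)
    kernel32.OpenProcess.restype = wintypes.HANDLE
    kernel32.CloseHandle.argtypes = (wintypes.HANDLE,)
    user32.GetGuiResources.argtypes = (wintypes.HANDLE, wintypes.DWORD)
    user32.GetGuiResources.restype = wintypes.DWORD

    handle = kernel32.OpenProcess(PROCESS_QUERY_INFORMATION, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        return {
            "gdi": user32.GetGuiResources(handle, GR_GDIOBJECTS),
            "user": user32.GetGuiResources(handle, GR_USEROBJECTS),
        }
    finally:
        # Always release the handle, even if a query above fails.
        kernel32.CloseHandle(handle)

if sys.platform == "win32":
    # Query the current Python process as a quick smoke test.
    print(gui_resource_counts(os.getpid()))
```

The returned dictionary can be written straight to the CSV alongside the other counters, using a PID found with either lookup shown above.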

benchmarking python script reveals mysterious time delays

I have two modules: moduleParent and moduleChild.
I'm doing something like this in moduleParent:
import moduleChild
#a bunch of code
start = time.time()
moduleChild.childFunction()
finish = time.time()
print "calling child function takes:", finish-start, "total seconds"
#a bunch of code
I'm doing something like this in moduleChild:
def childFunction():
    start = time.time()
    #a bunch of code
    finish = time.time()
    print "child function says it takes:", finish-start, "total seconds"
The output looks like this:
calling child function takes: .24 total seconds
child function says it takes: 0.0 total seconds
So my question is: where are these extra .24 seconds coming from?
Thank you for your expertise.
Here is the actual code for "childFunction". It really shouldn't take .24 seconds.
def getResources(show, resourceName='', resourceType=''):
    '''
    get a list of resources with the given name
    #show: show name
    #resourceName: name of resource
    #resourceType: type of resource
    #return: list of resource dictionaries
    '''
    t1 = time.time()
    cmd = r'C:\tester.exe -cmdFile "C:\%s\info.txt" -user root -pwd root'%show
    cmd += " -cmd findResources -machineFormatted "
    if resourceName:
        cmd += '-name %s'%resourceName
        if resourceType:
            cmd += '_' + resourceType.replace(".", "_") + "_"
    proc=subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output = proc.stdout.read()
    output = output.strip()
    resourceData = output.split("\r\n")
    resourceData = resourceData[1:]
    resourceList = []
    for data in resourceData:
        resourceId, resourceName, resourceType = data.split("|")
        rTyp = "_" + resourceType.replace(".", "_") + "_"
        shot, assetName = resourceName.split(rTyp)
        resourceName = assetName
        path = '//projects/%s/scenes/%s/%s/%s'%(show, shot, resourceType.replace(".", "/"), assetName)
        resourceDict = {'id':resourceId, 'name':resourceName, 'type':resourceType, 'path':path }
        resourceList.append(resourceDict)
    t2 = time.time()
    print (" ", t2 - t2, "seconds")
    return resourceList

Edit 2: I just noticed a typo in the child function: you have t2 - t2 in the print statement (presumably meant to be t2 - t1), so it always reports 0.0.
ignore below:
Calling the function itself has overhead (setting up stack space, saving local variables, returning, etc). The result suggests that your function is so trivial that setting up for a function call took longer than running the code itself.
Edit: also, calling the timers as well as print adds overhead. Now that I think about it, calling print could account for a lot of that .24 seconds. IO is slow.

You can't measure the time of a function by running it once, especially one which runs so short. There are a myriad of factors which could affect the timing, not the least of which is what other processes you have running on the system.
I would probably run something like this at least a few hundred times. Check out the timeit module.
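A minimal sketch of what that could look like with timeit, assuming a cheap stand-in for the function under test (child_function here is hypothetical, not the asker's code):

```python
import timeit

def child_function():
    # Stand-in workload (assumption): replace with the real function call.
    return sum(range(1000))

CALLS_PER_RUN = 200
# repeat() returns one total time per run; each run makes CALLS_PER_RUN calls.
runs = timeit.repeat(child_function, number=CALLS_PER_RUN, repeat=5)
best_per_call = min(runs) / CALLS_PER_RUN
print(f"best of {len(runs)} runs: {best_per_call:.2e} s per call")
```

Taking the minimum over several runs filters out most of the noise from other processes on the machine, which is the main weakness of timing a single call with time.time().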
