Most Python profilers are designed for standalone programs or scripts. In my case I'm working with a Python plugin for a third-party app (Blender), so the profiling needs to happen in real time while the user is interacting with the plugin.
I'm currently trying an injection strategy: procedurally searching through all of the plugin's modules and wrapping every single function in a profiler.
See below; this is what my current profiler looks like.
I'm wondering if there are other profilers out there that can be used for run-time scenarios such as plugins.
class ModuleProfiler:

    #profiler state
    allow = False     #must be True in order to start the profiler
    activated = False #read-only indication that the profiler has been activated

    #define your plugin main module here
    plugin_main_module = "MyBlenderPlugin"

    #function call registry
    registry = {}

    #ignore parameters, typically ui functions/modules
    ignore_fcts = [
        "draw",
        "foo",
    ]
    ignore_module = [
        "interface_drawing",
    ]

    event_prints = True #print every event?

    @classmethod
    def print_registry(cls):
        """print all registered benchmarks"""

        #generate averages
        for k,v in cls.registry.copy().items():
            cls.registry[k]["averagetime"] = v["runtime"]/v["calls"]

        print("")
        print("PROFILER: PRINTING OUTCOME")
        sorted_registry = dict(sorted(cls.registry.items(), key=lambda item: item[1]["runtime"], reverse=False))
        for k,v in sorted_registry.items():
            print("\n",k,":")
            for a,val in v.items():
                print(" "*6,a,":",val)

        return None

    @classmethod
    def update_registry(cls, fct, exec_time=0):
        """update the internal benchmark with new data"""

        key = f"{fct.__module__}.{fct.__name__}"
        r = cls.registry.get(key)
        if (r is None):
            cls.registry[key] = {}
            cls.registry[key]["calls"] = 0
            cls.registry[key]["runtime"] = 0
            r = cls.registry[key]
        r["calls"] += 1
        r["runtime"] += exec_time

        return None

    @classmethod
    def profile_wrap(cls, fct):
        """wrap any function with our benchmark & call-counter"""

        #ignore some functions?
        if (fct.__name__ in cls.ignore_fcts):
            return fct

        import functools
        import time

        @functools.wraps(fct)
        def inner(*args,**kwargs):
            t = time.time()
            r = fct(*args,**kwargs)
            exec_time = time.time()-t #measure once, so the print matches the registry
            cls.update_registry(fct, exec_time=exec_time)
            if cls.event_prints:
                print(f"PROFILER : {fct.__module__}.{fct.__name__} : {exec_time}")
            return r

        return inner

    @classmethod
    def start(cls):
        """inject the wrapper into every function of every sub-module of our plugin
        used for benchmarking or debugging purposes only"""

        if (not cls.allow):
            return None
        cls.activated = True

        import types
        import sys

        def is_function(obj):
            """check if the given object is a function"""
            return isinstance(obj, types.FunctionType)

        print("")

        #for all modules in sys.modules
        for mod_k,mod in sys.modules.copy().items():

            #separate module component names
            mod_list = mod_k.split('.')

            #filter out what isn't ours
            if (mod_list[0]!=cls.plugin_main_module):
                continue

            #ignore some modules?
            if any([m in cls.ignore_module for m in mod_list]):
                continue

            print("PROFILER_SEARCH : ",mod_k)

            #for each object found in the module
            for ele_k,ele in mod.__dict__.items():

                #if it does not have a name, skip it
                if (not hasattr(ele,"__name__")):
                    continue

                #we found a global function
                elif is_function(ele):
                    print(f"  INJECT LOCAL_FUNCTION: {mod_k}.{ele_k}")
                    mod.__dict__[ele_k] = cls.profile_wrap(ele)

                #or a homebrewed class? search for class fcts
                #the class.fcts implementation is not flawless, need to investigate issue(s)
                elif repr(ele).startswith(f"<class '{cls.plugin_main_module}."):
                    for class_k,class_e in ele.__dict__.items():
                        if is_function(class_e):
                            print(f"  INJECT CLASS_FUNCTION: {mod_k}.{ele_k}.{class_k}")
                            #class __dict__ is a mappingproxy, so we must assign with setattr
                            setattr(mod.__dict__[ele_k], class_k, cls.profile_wrap(class_e))

        print("")
        return None
ModuleProfiler.allow = True
ModuleProfiler.plugin_main_module = "MyModule"
ModuleProfiler.start()
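For reference, the standard library's cProfile can also be driven at run time through its object API; it is deterministic rather than sampling, but it can be switched on and off while the host app runs. A minimal sketch, with hypothetical start/stop hooks that a plugin could expose:

import cProfile
import pstats

_profiler = cProfile.Profile()

def start_profiling():
    #begin collecting stats from this point on
    _profiler.enable()

def stop_profiling():
    #stop collecting & print the 20 most expensive entries
    _profiler.disable()
    pstats.Stats(_profiler).sort_stats("cumulative").print_stats(20)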
Is it possible to somehow have two functions with the same name, where only one of them gets defined?
Something like:
version = 'revA'

def RevA(fn):
    if version == 'revA':
        return fn
    else:
        return lambda *args, **kwargs: None

def RevB(fn):
    if version == 'revB':
        return fn
    else:
        return lambda *args, **kwargs: None

@RevA
def main():
    print("RevA")

@RevB
def main():
    print("RevB")

main()
How about classes and inheritance:
class Base:
    def main(self):
        print("base")

class RevA(Base):
    def main(self):
        print("RevA")

class RevB(Base):
    def main(self):
        print("RevB")

if version == 'revA':
    obj = RevA()
elif version == 'revB':
    obj = RevB()
else:
    obj = Base()

obj.main()
Also typical are factory functions like:
def get_obj(version, *args, **kwargs):
    omap = {'revA': RevA, 'revB': RevB}
    return omap[version](*args, **kwargs)
This allows you to call for example:
obj = get_obj('revA', 23, fish='burbot')
Which will be equivalent to:
if version == 'revA':
    obj = RevA(23, fish='burbot')
You can, but doing literally that would be very uncommon:
if version == 'revA':
    def main():
        print("RevA")
elif version == 'revB':
    def main():
        print("RevB")

main()
More usually, you'd define both functions then choose which one to use by assigning it to a variable:
def main_A():
    print("RevA")

def main_B():
    print("RevB")

# select the right version using a dispatch table
main = {
    'revA': main_A,
    'revB': main_B,
}[version]

main()
Variants of this latter approach are quite common; both web applications and graphical applications often work this way, with a table mapping URLs or user actions to the functions to be called. Often the table is maintained by the framework and your code adds entries to it in multiple places, sometimes in bulk (e.g. Django), sometimes one by one (e.g. Flask).
Having both functions defined (not just the selected one) means that you can also call each version directly; that's useful if the main program uses a dispatch table but various subsidiary code (such as the tests) needs to call a particular one of the functions.
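As an illustration of the one-by-one registration style mentioned above (a hand-rolled sketch of the general pattern, not the actual Django or Flask API):

HANDLERS = {}

def register(name):
    def decorator(fn):
        HANDLERS[name] = fn   # add one entry to the dispatch table
        return fn
    return decorator

@register('revA')
def main_A():
    print("RevA")

@register('revB')
def main_B():
    print("RevB")

HANDLERS[version]()   # dispatch on the selected version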
Python offers tracing through its trace module. There are also custom solutions like this one. But these approaches capture most low-level executions, inside and out of almost every library you use. Other than deep-dive debugging, this isn't very useful.
It would be nice to have something that captures only the highest-level functions laid out in your pipeline. For example, if I had:
def funct1():
    res = funct2()
    print(res)

def funct2():
    factor = 3
    res = funct3(factor)
    return res

def funct3(factor):
    res = 1 + 100*factor
    return res
...and called:
funct1()
...it would be nice to capture:
function order:
- funct1
- funct2
- funct3
I have looked at:
trace
tracefunc
sys.settrace
trace.py
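For reference, the sys.settrace hook listed above can be narrowed to your own code by filtering on each frame's filename. A minimal sketch, assuming funct1/funct2/funct3 live in the current script:

import sys

def call_logger(frame, event, arg):
    # report only 'call' events that originate in this file,
    # skipping every library frame
    if event == 'call' and frame.f_code.co_filename == __file__:
        print('-', frame.f_code.co_name)
    return None  # returning None disables per-line tracing

sys.settrace(call_logger)
funct1()
sys.settrace(None)  # always unhook when done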
I am happy to manually mark the functions inside the scripts, like we do with docstrings. Is there a way to add "hooks" to functions, then track them as they get called?
You can always use a decorator to track which functions are called. Here is an example that allows you to keep track of what nesting level the function is called at:
class Tracker:
    level = 0

    def __init__(self, indent=2):
        self.indent = indent

    def __call__(self, fn):
        def wrapper(*args, **kwargs):
            print(' '*(self.indent * self.level) + '-' + fn.__name__)
            self.level += 1
            out = fn(*args, **kwargs)
            self.level -= 1
            return out
        return wrapper

track = Tracker()

@track
def funct1():
    res = funct2()
    print(res)

@track
def funct2():
    factor = 3
    res = funct3(factor)
    return res

@track
def funct3(factor):
    res = 1 + 100*factor
    return res
It uses the class variable level to keep track of how many nested functions have been called, and simply prints out the function name with a space indent. So calling funct1 gives:
funct1()
# prints:
-funct1
  -funct2
    -funct3
# returns:
301
Depending on how you want to save the output, you can use the logging module instead of print.
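For instance, a minimal variant that reuses the same nesting logic but records through logging (the 'trace.log' filename and logger name are arbitrary choices):

import logging

logging.basicConfig(filename='trace.log', level=logging.INFO)
log = logging.getLogger('tracker')

class LoggingTracker(Tracker):
    def __call__(self, fn):
        def wrapper(*args, **kwargs):
            # same indentation scheme as above, written to the log file
            log.info('%s-%s', ' ' * (self.indent * self.level), fn.__name__)
            self.level += 1
            out = fn(*args, **kwargs)
            self.level -= 1
            return out
        return wrapper

track = LoggingTracker()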
I am writing a program that, depending on certain values from an Excel table, makes an API call. There are two conditions from the table that will be checked:
Language
Provider
Depending on those two values a different set of constants is needed for the API call:
def run_workflow(provider, language, workflow):
    if provider == 'xxxx' and language == 0:
        wf_ready = provider_ready
        wf_unverified = provider_unverified
        wf_active = provider_active
        wf_another = provider_another
        wf_closed = provider_closed
        wf_wrongid = provider_wrongid
    elif provider == 'yyyy' and language == 0:
        wf_ready = provider_ready
        wf_unverified = provider_unverified
        wf_active = provider_active
        wf_another = provider_another
        wf_closed = provider_closed
        wf_wrongid = provider_wrongid
    elif ...

    if workflow == 'ready':
        response = requests.post(API + wf_ready, headers=header, data=json.dumps(conversation))
    elif workflow == 'unverified':
        response = requests.post(API + wf_unverified, headers=header, data=json.dumps(conversation))
    elif ...
There are two providers and seven different languages, and I am trying to figure out the most efficient (and Pythonic) way to handle this scenario. I came up with a class that has a method per language:
class Workflow_Language():
    def english(self):
        self.provider_unverified = 1112
        self.provider_ready = 1113
        self.provider_active = 1114
        self.provider_vip = 1115
    def russian(self):
        self.provider_unverified = 1116
        self.provider_ready = 1117
        self.provider_active = 1118
        self.provider_vip = 1119
    def ...
    ...
Is there maybe a better way to handle this?
One way is to map constants to appropriate handlers:
class LanguageData:
    def __init__(self, unverified, ready, active, vip):
        self.unverified = unverified
        self.ready = ready
        self.active = active
        self.vip = vip

def english():
    return LanguageData(1,2,3,4)

def russian():
    return LanguageData(5,6,7,8)

LANGUAGE_MAP = {'en': english, 'ru': russian}
I've made up the 'en' and 'ru' values for clarity; it seems that in your case the language is a number (0). Also note that english and russian are standalone functions. Finally, the LanguageData class is not mandatory: you can simply return a dictionary from those functions, but working with attributes instead of string keys seems easier to maintain.
And then in the code:
def run_workflow(provider, language, workflow):
    lang_data = LANGUAGE_MAP[language]()
    if workflow == 'ready':
        url = API + lang_data.ready
    elif workflow == 'unverified':
        url = API + lang_data.unverified
    response = requests.post(url, headers=header, data=json.dumps(conversation))
Of course workflow can be wrapped in a similar way if there are more than two possible values.
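For instance, since LanguageData exposes the workflow values as attributes, the whole if/elif chain can collapse into a getattr lookup; a sketch that assumes each workflow string matches an attribute name:

def run_workflow(provider, language, workflow):
    lang_data = LANGUAGE_MAP[language]()
    # 'ready' -> lang_data.ready, 'unverified' -> lang_data.unverified, ...
    url = API + getattr(lang_data, workflow)
    response = requests.post(url, headers=header, data=json.dumps(conversation))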
Analogously for provider. Unless the action depends on both provider and language at the same time, in which case you need a double map:
LANG_PROV_MAP = {
    ('en', 'xxxx'): first,
    ('ru', 'yyyy'): second,
}

def run_workflow(provider, language, workflow):
    data = LANG_PROV_MAP[(language, provider)]()
    ...
The original code can be simplified with a tricky decorator:
LANGUAGE_MAP = {}

def language_handler(lang):
    def wrapper(fn):
        LANGUAGE_MAP[lang] = fn
        return fn
    return wrapper

@language_handler('en')
def handler():
    return LanguageData(1,2,3,4)

@language_handler('ru')
def handler():
    return LanguageData(5,6,7,8)
Also note that if the data is "constant" (i.e. doesn't depend on the context) then you can completely omit the callables to make everything even simpler:
LANGUAGE_MAP = {
    'en': LanguageData(1,2,3,4),
    'ru': LanguageData(5,6,7,8),
}

def run_workflow(provider, language, workflow):
    data = LANGUAGE_MAP[language]
    ...
The combination of the language and the provider can compose the method name, and the call can then be invoked dynamically.
Example:
import sys

def provider1_lang2():
    pass

def provider2_lang4():
    pass

# get the provider / lang and call the method dynamically
provider = 'provider2'
lang = 'lang4'
method_name = '{}_{}'.format(provider, lang)
method = getattr(sys.modules[__name__], method_name)
method()
I'm trying to create an MPxNode with multiple outputs, but I can only get one to work properly. The other output doesn't get set properly after connecting the node, or during undo.
Is it possible to set both outputs at the same time in compute, like I'm trying to? It does work if I change the first line of compute to if plug != self.output1 and plug != self.output2, but then the node calculates twice, which is wasteful. And you can imagine how bad this would get with even more outputs.
I managed to minimize the code to this simple example. I'm scripting it in Python on Maya 2018:
import maya.OpenMayaMPx as OpenMayaMPx
import maya.OpenMaya as OpenMaya

class MyAwesomeNode(OpenMayaMPx.MPxNode):
    # Define node properties.
    kname = "myAwesomeNode"
    kplugin_id = OpenMaya.MTypeId(0x90000005)

    # Define node attributes.
    in_val = OpenMaya.MObject()
    output1 = OpenMaya.MObject()
    output2 = OpenMaya.MObject()

    def __init__(self):
        OpenMayaMPx.MPxNode.__init__(self)

    def compute(self, plug, data):
        # Only operate on the output1 attribute.
        if plug != self.output1:
            return OpenMaya.kUnknownParameter

        # Get input value.
        val = data.inputValue(MyAwesomeNode.in_val).asFloat()

        # Set output 2.
        # This fails when setting up the node and during undos.
        out_plug_2 = data.outputValue(self.output2)
        if val > 0:
            out_plug_2.setFloat(1)
        else:
            out_plug_2.setFloat(0)
        out_plug_2.setClean()

        # Set output 1.
        # This works as expected.
        out_plug_1 = data.outputValue(self.output1)
        out_plug_1.setFloat(val)
        out_plug_1.setClean()

        data.setClean(plug)

        return True

def creator():
    return OpenMayaMPx.asMPxPtr(MyAwesomeNode())

def initialize():
    nattr = OpenMaya.MFnNumericAttribute()

    MyAwesomeNode.output2 = nattr.create("output2", "output2", OpenMaya.MFnNumericData.kFloat)
    nattr.setWritable(False)
    nattr.setStorable(False)
    MyAwesomeNode.addAttribute(MyAwesomeNode.output2)

    MyAwesomeNode.output1 = nattr.create("output1", "output1", OpenMaya.MFnNumericData.kFloat)
    nattr.setWritable(False)
    nattr.setStorable(False)
    MyAwesomeNode.addAttribute(MyAwesomeNode.output1)

    MyAwesomeNode.in_val = nattr.create("input", "input", OpenMaya.MFnNumericData.kFloat, 1)
    nattr.setKeyable(True)
    MyAwesomeNode.addAttribute(MyAwesomeNode.in_val)

    MyAwesomeNode.attributeAffects(MyAwesomeNode.in_val, MyAwesomeNode.output2)
    MyAwesomeNode.attributeAffects(MyAwesomeNode.in_val, MyAwesomeNode.output1)

def initializePlugin(obj):
    plugin = OpenMayaMPx.MFnPlugin(obj, "Me", "1.0", "Any")
    try:
        plugin.registerNode(MyAwesomeNode.kname, MyAwesomeNode.kplugin_id, creator, initialize)
    except:
        raise RuntimeError("Failed to register node: '{}'".format(MyAwesomeNode.kname))

def uninitializePlugin(obj):
    plugin = OpenMayaMPx.MFnPlugin(obj)
    try:
        plugin.deregisterNode(MyAwesomeNode.kplugin_id)
    except:
        raise RuntimeError("Failed to deregister node: '{}'".format(MyAwesomeNode.kname))

# Example usage of the node
if __name__ == "__main__":
    import maya.cmds as cmds

    cmds.createNode("transform", name="result")
    cmds.setAttr("result.displayLocalAxis", True)
    cmds.createNode("myAwesomeNode", name="myAwesomeNode")
    cmds.connectAttr("myAwesomeNode.output1", "result.translateX")

    # This output fails.
    cmds.polyCube(name="cube")
    cmds.setAttr("cube.translate", 0, 3, 0)
    cmds.connectAttr("myAwesomeNode.output2", "cube.scaleX")
    cmds.connectAttr("myAwesomeNode.output2", "cube.scaleY")
    cmds.connectAttr("myAwesomeNode.output2", "cube.scaleZ")
I have a solution which works as expected. All outputs still must go through compute(), but only one output does the actual heavy calculation.
When going through compute, it checks whether any of the output plugs is clean. If all are dirty, we need to re-calculate; otherwise, if we find one clean plug, we can just use the cached value we saved earlier.
Here's an example:
import maya.OpenMayaMPx as OpenMayaMPx
import maya.OpenMaya as OpenMaya

class MyAwesomeNode(OpenMayaMPx.MPxNode):
    # Define node properties.
    kname = "myAwesomeNode"
    kplugin_id = OpenMaya.MTypeId(0x90000005)

    # Define node attributes.
    in_val = OpenMaya.MObject()
    output1 = OpenMaya.MObject()
    output2 = OpenMaya.MObject()

    def __init__(self):
        OpenMayaMPx.MPxNode.__init__(self)

        # Store value here.
        self.cached_value = 0

    def compute(self, plug, data):
        # Include all outputs here.
        if plug != self.output1 and plug != self.output2:
            return OpenMaya.kUnknownParameter

        # Get plugs.
        val = data.inputValue(MyAwesomeNode.in_val).asFloat()
        out_plug_1 = data.outputValue(self.output1)
        out_plug_2 = data.outputValue(self.output2)
        dep_node = OpenMaya.MFnDependencyNode(self.thisMObject())

        # Determine if this output needs to recalculate or can simply use cached values.
        use_cache_values = False
        for name in ["output1", "output2"]:
            mplug = dep_node.findPlug(name)
            if data.isClean(mplug):
                # If we find a clean plug then just use cached values.
                use_cache_values = True
                break

        if use_cache_values:
            # Use the cached value.
            value = self.cached_value
        else:
            # Calculate the value.
            # We can potentially make big computations here.
            self.cached_value = val
            value = val

        # Set output 1.
        if plug == self.output1:
            out_plug_1.setFloat(value)
            out_plug_1.setClean()

        # Set output 2.
        if plug == self.output2:
            if value > 0:
                out_plug_2.setFloat(1)
            else:
                out_plug_2.setFloat(0)
            out_plug_2.setClean()

        data.setClean(plug)

        return True

def creator():
    return OpenMayaMPx.asMPxPtr(MyAwesomeNode())

def initialize():
    nattr = OpenMaya.MFnNumericAttribute()

    MyAwesomeNode.output2 = nattr.create("output2", "output2", OpenMaya.MFnNumericData.kFloat)
    nattr.setWritable(False)
    nattr.setStorable(False)
    MyAwesomeNode.addAttribute(MyAwesomeNode.output2)

    MyAwesomeNode.output1 = nattr.create("output1", "output1", OpenMaya.MFnNumericData.kFloat)
    nattr.setWritable(False)
    nattr.setStorable(False)
    MyAwesomeNode.addAttribute(MyAwesomeNode.output1)

    MyAwesomeNode.in_val = nattr.create("input", "input", OpenMaya.MFnNumericData.kFloat, -1)
    nattr.setKeyable(True)
    MyAwesomeNode.addAttribute(MyAwesomeNode.in_val)

    # Include both outputs.
    MyAwesomeNode.attributeAffects(MyAwesomeNode.in_val, MyAwesomeNode.output1)
    MyAwesomeNode.attributeAffects(MyAwesomeNode.in_val, MyAwesomeNode.output2)

def initializePlugin(obj):
    plugin = OpenMayaMPx.MFnPlugin(obj, "Me", "1.0", "Any")
    try:
        plugin.registerNode(MyAwesomeNode.kname, MyAwesomeNode.kplugin_id, creator, initialize)
    except:
        raise RuntimeError("Failed to register node: '{}'".format(MyAwesomeNode.kname))

def uninitializePlugin(obj):
    plugin = OpenMayaMPx.MFnPlugin(obj)
    try:
        plugin.deregisterNode(MyAwesomeNode.kplugin_id)
    except:
        raise RuntimeError("Failed to deregister node: '{}'".format(MyAwesomeNode.kname))

# Example usage of the node
if __name__ == "__main__":
    import maya.cmds as cmds

    cmds.createNode("transform", name="result")
    cmds.setAttr("result.displayLocalAxis", True)
    cmds.createNode("myAwesomeNode", name="myAwesomeNode")
    cmds.connectAttr("myAwesomeNode.output1", "result.translateX")

    # This output previously failed.
    cmds.polyCube(name="cube")
    cmds.setAttr("cube.translate", 0, 3, 0)
    cmds.connectAttr("myAwesomeNode.output2", "cube.scaleX")
    cmds.connectAttr("myAwesomeNode.output2", "cube.scaleY")
    cmds.connectAttr("myAwesomeNode.output2", "cube.scaleZ")
The outputs seem to react correctly when re-opening the file, when importing it into a new scene, and when referencing it. I just need to transfer the same idea to C++, then it'll be golden.
I am trying to introduce Python 3 support in the mime package, and the code is doing something I have never seen before.
There is a class Types() that is used in the package as a static class.
class Types(with_metaclass(ItemMeta, object)): # I changed this for 2-3 compatibility
    type_variants = defaultdict(list)
    extension_index = defaultdict(list)
    # __metaclass__ = ItemMeta # unnecessary now

    def __init__(self, data_version=None):
        self.data_version = data_version
The type_variants defaultdict is what is getting filled in Python 2 but not in Python 3.
It very much seems to be getting filled by this class, which is in a different file called mime_types.py:
class MIMETypes(object):
    _types = Types(VERSION)

    def __repr__(self):
        return '<MIMETypes version:%s>' % VERSION

    @classmethod
    def load_from_file(cls, type_file):
        data = open(type_file).read()
        data = data.split('\n')
        mime_types = Types()
        for index, line in enumerate(data):
            item = line.strip()
            if not item:
                continue
            try:
                ret = TEXT_FORMAT_RE.match(item).groups()
            except Exception as e:
                __parsing_error(type_file, index, line, e)

            (unregistered, obsolete, platform, mediatype, subtype, extensions,
             encoding, urls, docs, comment) = ret

            if mediatype is None:
                if comment is None:
                    __parsing_error(type_file, index, line, RuntimeError)
                continue

            extensions = extensions and extensions.split(',') or []
            urls = urls and urls.split(',') or []
            mime_type = Type('%s/%s' % (mediatype, subtype))

            mime_type.extensions = extensions
            ...
            mime_type.url = urls

            mime_types.add(mime_type) # instance of Type() is being filled?
        return mime_types
The function startup() is run whenever mime_types.py is imported, and it does this:
def startup():
    global STARTUP
    if STARTUP:
        type_files = glob(join(DIR, 'types', '*'))
        type_files.sort()
        for type_file in type_files:
            MIMETypes.load_from_file(type_file) # class method is filling Types?
        STARTUP = False
This all seems pretty weird to me. The MIMETypes class first creates an instance of Types() on its first line, _types = Types(VERSION). It then seems to do nothing with this instance and only uses the mime_types instance created in the load_from_file() class method, mime_types = Types().
This sort of thing vaguely reminds me of JavaScript class construction. How is the instance mime_types filling Types.type_variants so that, when it is imported like this:
from mime import Type, Types
the class's type_variants defaultdict can be used? And why isn't this working in Python 3?
EDIT:
Adding extra code to show how type_variants is filled (in the Types class):
@classmethod
def add_type_variant(cls, mime_type):
    cls.type_veriants[mime_type.simplified].append(mime_type)

@classmethod
def add(cls, *types):
    for mime_type in types:
        if isinstance(mime_type, Types):
            cls.add(*mime_type.defined_types())
        else:
            mts = cls.type_veriants.get(mime_type.simplified)
            if mts and mime_type in mts:
                Warning('Type %s already registered as a variant of %s.',
                        mime_type, mime_type.simplified)
            cls.add_type_variant(mime_type)
            cls.index_extensions(mime_type)
You can see that MIMETypes uses the add() classmethod.
Without posting more of your code, it's hard to say. I will say that I was able to get that package ported to Python 3 with only a few changes (print statement -> function, basestring -> str, adding a dot before same-package imports, and a really ugly hack to compensate for their love of cmp):
def cmp(x, y):
    if isinstance(x, Type): return x.__cmp__(y)
    if isinstance(y, Type): return y.__cmp__(x) * -1
    return 0 if x == y else (1 if x > y else -1)
Note, I'm not even sure this is correct.
Then
import mime
print(mime.Types.type_veriants) # sic
printed out a 1590-entry defaultdict.
Regarding your question about MIMETypes._types not being used, I agree, it's not.
Regarding your question about how the dictionary is being populated, it's quite simple, and you've identified most of it.
import mime
imports the package's __init__.py, which contains the line:
from .mime_types import MIMETypes, VERSION
And mime_types.py includes the lines:
def startup():
    global STARTUP
    if STARTUP:
        type_files = glob(join(DIR, 'types', '*'))
        type_files.sort()
        for type_file in type_files:
            MIMETypes.load_from_file(type_file)
        STARTUP = False

startup()
And MIMETypes.load_from_file() has the lines:
mime_types = Types()
#...
for ... in ...:
    mime_types.add(mime_type)
And Types.add() has the line:
cls.add_type_variant(mime_type)
And that classmethod contains:
cls.type_veriants[mime_type.simplified].append(mime_type)
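To see why the dictionary ends up on the class (and is therefore visible to every importer), here is a minimal sketch of the same pattern, with made-up names:

from collections import defaultdict

class Registry:
    items = defaultdict(list)  # class-level, shared by every instance

    @classmethod
    def add(cls, key, value):
        cls.items[key].append(value)  # mutates the class attribute

r = Registry()         # like mime_types = Types()
r.add('a', 1)          # called through an instance, but still hits Registry.items
print(Registry.items)  # defaultdict(<class 'list'>, {'a': [1]})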