As said in the title, I want to overwrite a function/method with the content of a string.
Background
I want to use an external library that processes a CNN model. The problem is that some of these models contain constructs the library can't handle. I investigated the structure of the model and found that some layers use operators like "+=", which the external library doesn't support, so I have to replace these lines wherever I find them.
To do this, I used the inspect module to get a string containing the source of the function with the problematic lines.
From this string I generate an abstract syntax tree with the ast module.
In this structure I can change the function as needed.
Now I can turn this abstract syntax tree back into a string if needed. The problem is that I tried a lot of techniques to transform this string into a function, so that I can overwrite the problematic functions with my generated code, but I failed every time.
This is my code at the moment; sorry for the bad style, I'm a newbie at this :(
import torchvision
import torch
import inspect
import textwrap
import ast
from pprint import pprint
import torch.nn as nn
# This is a model which i have to investigate
model = torchvision.models.resnet18()
# Here i extract all classes and functions to check them below
classes = [
x[0] for x in inspect.getmembers(nn, inspect.isclass)
]
# This class changed my abstract syntax tree into the form i need
class Trasformer(ast.NodeTransformer):
    def visit_AugAssign(self,node):
        return (ast.Assign(targets=[ast.Name(id='out', ctx=ast.Store())], value=ast.Call(func=ast.Attribute(value=ast.Name(id='torch', ctx=ast.Load()), attr='stack', ctx=ast.Load()), args=[ast.List(elts=[ast.Name(id='identity', ctx=ast.Load()), ast.Name(id='out', ctx=ast.Load())], ctx=ast.Load())], keywords=[ast.keyword(arg='dim', value=ast.UnaryOp(op=ast.USub(), operand=ast.Constant(value=1)))])), ast.Assign(targets=[ast.Name(id='out', ctx=ast.Store())], value=ast.Call(func=ast.Attribute(value=ast.Name(id='self', ctx=ast.Load()), attr='canonizer_sum', ctx=ast.Load()), args=[ast.Name(id='out', ctx=ast.Load())], keywords=[])))
# This function checks, if there
def recursive(model):
    children = list(model.children())
    print(f"Checking {model.__class__.__name__}\t{len(children)}")
    if len(children) == 0:
        # print(model)
        pass
    else:
        if not model.__class__.__name__ in classes:
            # print(model)
            print(f"!!{model.__class__.__name__}")
            print(textwrap.dedent(inspect.getsource(model.forward)))
            """The code of the function is modified here"""
            forward_code = textwrap.dedent(inspect.getsource(model.forward))
            tree = ast.parse(forward_code)
            #print(ast.dump(tree))
            # Use the Transformer class to visit all nodes in the abstract syntax tree; if a node is an AugAssign (+=), it is replaced by two nodes with the correct statements
            Trasformer().visit(tree)
            # fills in the location information needed for the new nodes after the change
            tree = ast.fix_missing_locations(tree)
            # print(ast.dump(tree, indent=2))
            # Here the changed AST is turned back into a string
            forward_code = ast.unparse(tree)
            """Up to this point everything works fine. But now the problems start."""
            # At this point I tried different ways to overwrite the function "model.forward" with the function held in the string. But nothing worked. This is my last attempt
            model.forward = exec(compile(tree, filename='test', mode='exec'))
            # here the first line of code is cut off (the function definition), because just the code inside should be overwritten
            # forward_code = forward_code.split("\n", 1)[1]
            print("new string:\n")
            #print(forward_code)
            print(textwrap.dedent(inspect.getsource(model.forward)))
            # TODO: Transform the string into the method -> i.e. overwriting the method is needed
            # setattr(model,"forward", forward_code)
            #print("new:\n")
            #print(textwrap.dedent(inspect.getsource(model.forward)))
        for module in children:
            recursive(module)
recursive(model)
quit()
So my question is: is it possible to overwrite a function/method with the content of a string?
I had different ideas:
Overwrite it directly --> doesn't work, because you can't assign a string to a function.
Write the string into a file and load the content of the file into the function.
I haven't tried this yet, because I thought there has to be a better way.
Use exec/eval/compile. I think this could work, but I have no idea how.
I tried it both with and without the function header in the string.
Could somebody please help me and tell me how to overwrite the function with the corrected AST or with the string generated from it?
Is there a module or function in the Python standard library that I could use?
Thanks in advance to all volunteer helpers :)
In practice you don't edit the body of an existing Python function; once created, it is treated as fixed. But you can normally replace it with a different function, by setting the attribute of the class or object from which the function was extracted.
By the way, if you have the AST of a function, there is no need to turn it into a string in order to compile it. You can just compile the AST itself.
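A minimal sketch of that approach, without torch so it stays self-contained. The names Block, SOURCE, ExpandAugAssign and build_function are made up for illustration, and the transformer here simply expands "x += y" into "x = x + y" rather than reproducing the torch.stack rewrite from the question. In the real case, SOURCE would come from textwrap.dedent(inspect.getsource(model.forward)), and the namespace passed to exec would probably need the original function's globals (for example model.forward.__globals__) so that names like torch still resolve.

```python
import ast
import types

# Stand-in for the dedented source returned by inspect.getsource (assumption).
SOURCE = '''
def forward(self, x, identity):
    out = x * 2
    out += identity
    return out
'''


class Block:
    """Hypothetical class whose forward method gets replaced."""

    def forward(self, x, identity):
        return None


class ExpandAugAssign(ast.NodeTransformer):
    """Rewrite `name op= value` as `name = name op value`."""

    def visit_AugAssign(self, node):
        if not isinstance(node.target, ast.Name):
            return node  # leave attribute/subscript targets untouched
        load_target = ast.Name(id=node.target.id, ctx=ast.Load())
        new_node = ast.Assign(
            targets=[node.target],
            value=ast.BinOp(left=load_target, op=node.op, right=node.value),
        )
        return ast.copy_location(new_node, node)


def build_function(source, name, namespace=None):
    """Parse, transform, compile the AST directly, and return the function."""
    tree = ExpandAugAssign().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    code = compile(tree, filename="<patched>", mode="exec")
    scope = {} if namespace is None else dict(namespace)
    exec(code, scope)  # running the `def` statement creates the function
    return scope[name]


block = Block()
new_forward = build_function(SOURCE, "forward")
# Bind the plain function to this one instance so `self` is passed in.
block.forward = types.MethodType(new_forward, block)
result = block.forward(3, 4)
```

exec itself returns None, which is why assigning its return value to model.forward cannot work. The function has to be fetched from the namespace that exec filled in. Setting the attribute on the class instead of the instance (Block.forward = new_forward) would replace the method for every instance.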
The python docs and greentreesnakes show it's possible to use ast.NodeTransformer to change python code via ASTs:
I want to change assignments like a=1 to a=variables(1).
I'm getting a bit confused over what ctx is and when it's needed, the output of astor doesn't seem to have it, yet it seems to be needed.
You do not necessarily need NodeTransformer to do this. If you already have the AST and know the node (an Assign node in this case), you can modify it in place as below.
import ast
src = 'a = 1'
code = 'variables(1)'
tree = ast.parse(src)
assign_node = tree.body[0]
assign_node.value = ast.parse(code).body[0].value
The updated AST can then be used for whatever your purpose is. For example, you can generate the source code back (say, with the astor module, as astor.to_source(tree) in the above example). NodeTransformer is handy if you need to find the node based on some condition. It has visit_* methods similar to NodeVisitor's that let you do this in place, like below:
from ast import parse, NodeTransformer
class ChangeAssignment(NodeTransformer):
    def visit_Assign(self, node):
        # Only rewrite assignments whose first target is the name 'a'
        if node.targets[0].id == 'a':
            node.value = parse('variables(1)').body[0].value
        return node
src = 'a = 1'
t = ChangeAssignment()
tree = parse(src)
tree = t.visit(tree)
The ctx of a variable node determines whether the variable is being loaded, stored or deleted. It is required if you create the node directly (above I use parse to create nodes implicitly). For modifying the AST or generating the source back (as with astor.to_source), it does not matter which ctx you use. If, however, you pass the AST directly to Python's compile function, it will complain if the correct ctx is not used.
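As a rough illustration of building nodes by hand with an explicit ctx and then compiling the tree, here is a small sketch. WrapAssignedValue and the variables function are hypothetical stand-ins; the question does not say what variables actually does.

```python
import ast


def variables(value):
    """Hypothetical stand-in for the wrapper the question wants to call."""
    return ("wrapped", value)


class WrapAssignedValue(ast.NodeTransformer):
    """Turn `target = expr` into `target = variables(expr)`."""

    def visit_Assign(self, node):
        node.value = ast.Call(
            # The function name is read, so it gets a Load context.
            func=ast.Name(id="variables", ctx=ast.Load()),
            args=[node.value],
            keywords=[],
        )
        return node


tree = WrapAssignedValue().visit(ast.parse("a = 1"))
ast.fix_missing_locations(tree)  # new nodes need lineno/col_offset to compile

namespace = {"variables": variables}
exec(compile(tree, filename="<ast>", mode="exec"), namespace)
a_value = namespace["a"]
```

The assignment target keeps the Store context that parse gave it; only the newly created Name node needs its ctx spelled out.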
I'm new to Python and I'm trying to write a class for a module that checks texts for curse words.
Can someone help, please?
import urllib
class Checktext:
    def __init__(self, text):
        self.text = text
    def gettext(self):
        file = open(self.text, "r")
        filetext = open.read()
        for word in filetext.split():
            openurl = urllib.request.urlopen("http://www.wdylike.appspot.com/?q=" + word)
            output = openurl.read()
            truer = "true" in str(output)
            print(truer)
s = Checktext(r"C:\Users\Tzach\.atom\Test\Training\readme.txt")
Checktext.gettext()
You declared s as a new Checktext object, so you need to call s.gettext() not an un-instantiated Checktext.gettext(), as that has no self to refer to
urllib is a package. You have to import the request module that is located in the package:
import urllib.request
open(filename) returns a file object. You want to call the read method of that object:
filetext = file.read()
And as G. Anderson wrote, you want to call s.gettext() instead of Checktext.gettext(). The self inside is actually the same object as the s outside. If you want to be weird, you can also use:
Checktext.gettext(s)
Notice the s passed as the missing parameter. Here Python reveals how object-oriented method calls are implemented internally. Most OO languages hide this carefully, but calling a method of an object is always translated internally into passing one extra special argument that points to the instance of the class, that is, the object. When defining a Python method, that special argument is explicitly named self (by convention; you can name it differently, and you can try that as an exercise, but you should always keep the convention).
Thinking it through, you get the key idea behind the hidden magic of OO syntax. The instance of the class (the object) is really just a portion of memory that stores the data, and it is passed to the functions that implement the methods. Checktext.gettext is the function; s is the object. s.gettext() is only a different way to express exactly the same call. Because s is an instance of the Checktext class, that fact is stored inside s. Therefore s.gettext() creates the illusion that the right code is called magically. It fits the trained brain better than the function approach if s is thought of as a tangible something.
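The equivalence of the two call styles can be checked with a tiny class that needs no file or network access. Greeter is a made-up example, not part of the question's code.

```python
class Greeter:
    """Made-up example class, used only to compare the two call styles."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


g = Greeter("Ada")
via_instance = g.greet()       # Python passes g as self automatically
via_class = Greeter.greet(g)   # the same function, with self passed by hand
```

Calling Greeter.greet() with no argument raises a TypeError about the missing self argument, which is the situation in the original Checktext.gettext() call.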
I am trying to build basic equation functionality into python-docx to output formulas to docx files. Can someone go over the standard-operating procedure for registering a new class in oxml? Looking at the source code, a tag appears to be declared by creating a complex-type class
class CT_P(BaseOxmlElement):
    """
    ``<w:p>`` element, containing the properties and text for a paragraph.
    """
    pPr = ZeroOrOne('w:pPr')
    r = ZeroOrMore('w:r')
and then registering it using the register_element_cls() function
from .text.paragraph import CT_P
register_element_cls('w:p', CT_P)
Some classes include other methods, but many do not, so it looks like a minimum working example would be this:
from docx import Document
from docx.oxml.xmlchemy import BaseOxmlElement, ZeroOrOne, ZeroOrMore, OxmlElement
import docx.oxml
docx.oxml.ns.nsmap['m'] = ('http://schemas.openxmlformats.org/officeDocument/2006/math')
class CT_OMathPara(BaseOxmlElement):
    r = ZeroOrMore('w:r')
docx.oxml.register_element_cls('m:oMathPara',CT_OMathPara)
p = CT_OMathPara()
(Note that I have to declare the m namespace, since it is not used in the package). Unfortunately this doesn't work for me at all. If I declare a new class derived as in the above example, and then check, for instance, the __repr__ of this new class, it causes an exception
>> p
File "C:\ProgramData\Anaconda3\lib\site-packages\docx\oxml\ns.py", line 50, in from_clark_name
nsuri, local_name = clark_name[1:].split('}')
ValueError: not enough values to unpack (expected 2, got 1)
This happens because the tag in my class is very different from a w:p tag created from the python-docx package
>> paragraph._element.tag
'{http://schemas.openxmlformats.org/wordprocessingml/2006/main}p'
>> p.tag
'CT_OMathPara'
But I don't know why this is. A file search through the source code does not reveal any other mentions of the CT_P class, so I'm a bit stumped.
I think the error is coming from the 'm' namespace prefix (nspfx) not being present in the docx.oxml.ns.pfxmap dict. The namespace needs to be looked up both ways (from nspfx to namespace url and from url to nspfx).
So to add the new namespace from "outside", meaning after the ns module is loaded, you'll need to do both (if you were to patch the ns module code directly, that second step would be handled automatically at load time):
nsmap, pfxmap = docx.oxml.ns.nsmap, docx.oxml.ns.pfxmap
nsmap['m'] = 'http://schemas.openxmlformats.org/officeDocument/2006/math'
pfxmap['http://schemas.openxmlformats.org/officeDocument/2006/math'] = 'm'
This should get you past the error you're getting; however, there's a little more to understand.
The CT_OMathPara class is an example of what's known as a custom element class. This means that lxml instantiates an object of this class for each element with the registered tag (m:oMathPara) instead of the generic lxml _Element class.
The key thing is, you need to let lxml do the constructing, which happens when it parses the XML. You can't get a meaningful object by constructing that class yourself.
The easiest way to create a new "loose" element (not lodged in an XML document tree) is to use docx.oxml.OxmlElement():
oMathPara = OxmlElement('m:oMathPara')
More commonly though, the docx.oxml.parse_xml() function is used to parse an entire XML snippet. A parser needs to be configured to use custom element classes, and those elements have to be registered with the parser, so you probably don't want to do that yourself when the one in the oxml module already takes care of it.
So usually, to get an instance of CT_OMathPara, you would just open a docx that contains an m:oMathPara element (after registering the new namespace and custom element class), but you can also just parse in an XML snippet. If you search for parse_xml in the oxml modules you'll find plenty of examples. You need to get the namespace declarations right at the top of the XML you provide, which can be a little tricky, but you can certainly spell out the entire XML snippet in text if you want; it just gets a bit verbose.
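To make the two-way namespace lookup mentioned earlier in this answer concrete, here is a standalone sketch in plain Python. It does not use python-docx, and its functions are illustrative stand-ins rather than the library's actual code; only the nsmap/pfxmap naming and the Clark-name split mirror what appears in the question and answer.

```python
WORD_URI = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
MATH_URI = "http://schemas.openxmlformats.org/officeDocument/2006/math"

# Prefix -> URI and URI -> prefix, kept in sync (stand-ins for the real maps).
nsmap = {"w": WORD_URI}
pfxmap = {uri: prefix for prefix, uri in nsmap.items()}


def register_namespace(prefix, uri):
    """Add a namespace in both directions, as the answer recommends."""
    nsmap[prefix] = uri
    pfxmap[uri] = prefix


def to_clark(tag):
    """'m:oMathPara' -> '{uri}oMathPara' (needs the prefix -> URI map)."""
    prefix, local_name = tag.split(":")
    return "{%s}%s" % (nsmap[prefix], local_name)


def from_clark(clark_name):
    """'{uri}oMathPara' -> 'm:oMathPara' (needs the URI -> prefix map)."""
    nsuri, local_name = clark_name[1:].split("}")
    return "%s:%s" % (pfxmap[nsuri], local_name)


register_namespace("m", MATH_URI)
clark = to_clark("m:oMathPara")
round_trip = from_clark(clark)
```

With only nsmap updated, to_clark would succeed but from_clark would fail with a KeyError on the math URI. A tag without braces, such as the 'CT_OMathPara' seen in the question, fails at the split with the same "not enough values to unpack" ValueError shown in the traceback.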
I found the following code snippet that I can't seem to make work for my scenario (or any scenario at all):
def load(code):
    # Delete all local variables
    globals()['code'] = code
    del locals()['code']
    # Run the code
    exec(globals()['code'])
    # Delete any global variables we've added
    del globals()['load']
    del globals()['code']
    # Copy k so we can use it
    if 'k' in locals():
        globals()['k'] = locals()['k']
        del locals()['k']
    # Copy the rest of the variables
    for k in locals().keys():
        globals()[k] = locals()[k]
I created a file called "dynamic_module" and put this code in it, which I then used to try to execute the following code which is a placeholder for some dynamically created string I would like to execute.
import random
import datetime
class MyClass(object):
    def main(self, a, b):
        r = random.Random(datetime.datetime.now().microsecond)
        a = r.randint(a, b)
        return a
Then I tried executing the following:
import dynamic_module
dynamic_module.load(code_string)
return_value = dynamic_module.MyClass().main(1,100)
When this runs it should return a random number between 1 and 100. However, I can't seem to get the initial snippet I found to work for even the simplest of code strings. I think part of my confusion in doing this is that I may misunderstand how globals and locals work and therefore how to properly fix the problems I'm encountering. I need the code string to use its own imports and variables and not have access to the ones where it is being run from, which is the reason I am going through this somewhat over-complicated method.
You should not be using the code you found. It has several big problems, not least that most of it doesn't actually do anything (locals() is a proxy, deleting from it has no effect on the actual locals, it puts any code you execute in the same shared globals, etc.)
Use the accepted answer in that post instead; recast as a function that becomes:
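import types
def load_module_from_string(code, name='dynamic_module'):
    # types.ModuleType does the job of imp.new_module (the imp module was removed in Python 3.12)
    module = types.ModuleType(name)
    # Run the code with the new module's namespace as its globals
    exec(code, module.__dict__)
    return module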
then just use that:
dynamic_module = load_module_from_string(code_string)
return_value = dynamic_module.MyClass().main(1, 100)
The function produces a new, clean module object.
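A self-contained version of the above that can be run as-is. CODE_STRING is an assumed stand-in for the dynamically generated source, and types.ModuleType is used to create the empty module.

```python
import types

# Assumed stand-in for the dynamically created code string.
CODE_STRING = '''
import random

class MyClass(object):
    def main(self, a, b):
        return random.randint(a, b)
'''


def load_module_from_string(code, name="dynamic_module"):
    """Execute `code` inside a fresh module object and return that module."""
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    return module


dynamic_module = load_module_from_string(CODE_STRING)
return_value = dynamic_module.MyClass().main(1, 100)
```

The import of random inside the string lands in dynamic_module's namespace, not in the caller's, and the code string cannot see the caller's variables through its globals. This is namespace separation only, not a security sandbox.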
In general, this is not how you should dynamically import and use external modules. You should be using __import__ within your function to do this. Here's a simple example that worked for me:
import numpy as np
plt = __import__('matplotlib.pyplot', fromlist=['plt'])
plt.plot(np.arange(5), np.arange(5))
plt.show()
I imagine that for your specific application (loading from code string) it would be much easier to save the dynamically generated code string to a file (in a folder containing an __init__.py file) and then to call it using __import__. Then you could access all variables and functions of the code as parts of the imported module.
Unless I'm missing something?
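One way the save-to-file idea could look, sketched with importlib instead of a bare __import__ call. CODE_STRING and import_from_code_string are hypothetical names, and a temporary directory stands in for the package folder mentioned above.

```python
import importlib.util
import os
import tempfile

# Assumed stand-in for the dynamically generated code string.
CODE_STRING = "def double(x):\n    return 2 * x\n"


def import_from_code_string(code, name="generated_module"):
    """Write `code` to a temporary .py file and import it as a module."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, name + ".py")
        with open(path, "w") as handle:
            handle.write(code)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the file's code in the module
    return module


generated = import_from_code_string(CODE_STRING)
doubled = generated.double(21)
```

The module object stays usable after the temporary file is gone, because its code has already been executed into the module's namespace.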
I'm writing an interpreter for an old in-game scripting language, so I need to build a dictionary that maps the name of each command in the language to the symbol for the corresponding function.
Now, I've already figured out here: How to call a function based on list entry?
...That you can call functions this way, and I know that you can use dir to get a list of strings of all functions in a module. I've been able to get this list, and using a regex, removed the built-in commands and anything else I don't actually want the script to be able to call. The goal is to sandbox here. :)
Now that I have the list of items that are defined in the module, I need to get the symbol for each definition.
For a more visual representation, this is the test module I want to get the symbol for:
def notify(stack,mufenv):
    print(stack[-1])
It's pulled in via an init script, and I am able to get the notify function's name in a list using:
import mufprims
import re
moddefs=dir(mufprims)
primsfilter=re.compile('__.+__')
primslist=[ 'mufprims.' + x for x in dir(mufprims) if not primsfilter.match(x) ]
print(primslist)
This returns:
['mufprims.notify']
...which is the exact name of the function I wish to find the symbol for.
I read over http://docs.python.org/library/symtable.html here, but I'm not sure I understand it. I think this is the key to what I want, but I didn't see an example that I could understand. Any ideas how I would get the symbol for the functions I've pulled from the list?
You want to get the function from the mufprims module by using getattr and the function name. Like so:
primslist=[getattr(mufprims, x) for x in dir(mufprims) if not primsfilter.match(x) ]
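Since the goal is a dictionary from command name to function, the same getattr call can feed a dict comprehension. In this sketch mufprims is faked with an empty module object so the example runs on its own, and notify returns the top of the stack instead of printing it; both are assumptions made for illustration.

```python
import re
import types

# Fake `mufprims` module so the sketch is self-contained (assumption).
mufprims = types.ModuleType("mufprims")


def notify(stack, mufenv):
    return stack[-1]


mufprims.notify = notify

primsfilter = re.compile('__.+__')

# Map each public, callable name in the module to the function object itself.
primitives = {
    name: getattr(mufprims, name)
    for name in dir(mufprims)
    if not primsfilter.match(name) and callable(getattr(mufprims, name))
}

result = primitives["notify"](["hello"], None)
```

An interpreter could then dispatch a command with primitives[command_name](stack, mufenv), and names missing from the dictionary are simply not callable from the script.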
I thought I might add another possible suggestion for retrieving the functions of an object:
import inspect
# example using os.path
import os.path
results = inspect.getmembers(os.path, inspect.isroutine)
print(results)
# truncated result
[...,
('splitdrive', <function splitdrive at 0x1002bcb18>),
('splitext', <function splitext at 0x1002bcb90>),
('walk', <function walk at 0x1002bda28>)]
Using dir on the object would essentially give you every member of that object, including non-callable attributes, etc. You could use the inspect module to get a more controlled return type.