limited eval function failing - how to import selected modules - python

I want to attach functional stubs to a data process job I've written, and it would be convenient to be able to apply these via config file.
I can load and run these by means of the eval function, but want to be able to control the available namespace "sandbox" in which the evaluated functions can operate, so I can avoid malicious code injection.
The Python docs suggest blocking off __builtins__ and then populating globals and/or locals (it isn't clear whether you need one or both) as dictionaries containing the objects in the execution namespace.
When I do this, the code I had been running successfully stops working.
I think this is because one of my test lambdas is referencing functions normally imported from the datetime module - but it's not clear to me how to get these to successfully attach to the namespace.
from datetime import datetime
import os  # imported here only to demonstrate the unwanted access

now = datetime.now()
lambdas = {"Report Date": "lambda x: now.strftime(\"%d/%m/%Y\")",
           "Scary Test": "lambda x: os.curdir"}
compiled_funcs = {k: eval(v) for k, v in lambdas.items()}
compiled_funcs['Report Date'](1)
>>> '15/04/2019'
compiled_funcs['Scary Test'](1)
>>> '.'
Now, I want to adjust the eval() call to limit the available scope so that the datetime-based lambda continues to work but the os module fails (if I can call an os command, then I could do something scary like wipe the disk, or worse).
I have tried constructions like:
compiled_funcs = {k: eval(v, {'__builtins__': None, "now": now, "datetime": datetime}, {}) for k, v in lambdas.items()}
But when I do this, I get the following error:
AttributeError: 'NoneType' object has no attribute '__import__'
This suggests that somewhere/somehow, the function I want to apply is trying to call/import some content down the line, and (presumably) this is correctly being blocked by having borked the __builtins__ content. Is there a way to pre-package such functions and inject them into the eval globals or locals dictionaries, to enable a pre-defined set of functional tools?
Once I've got this working, I should be able to extend it so I can curate my own subset of safe function calls to be exposed to this run-time collection from configuration files.
N.B. I know in the above, I could define my lambdæ with no arguments - but in my real code, it would be normal to feed a single parameter, so have built my test code accordingly.
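For what it's worth, the shape of thing I'm aiming for is roughly the sketch below: a curated globals dictionary that exposes only whitelisted names (using an empty dict rather than None for __builtins__). I appreciate that eval plus a stripped-down __builtins__ is not a genuine security boundary, so treat this purely as an illustration of the intent:

from datetime import datetime

now = datetime.now()

# whitelist: the only names the evaluated lambdas may resolve
safe_globals = {"__builtins__": {}, "now": now, "datetime": datetime}

lambdas = {"Report Date": "lambda x: now.strftime('%d/%m/%Y')",
           "Scary Test": "lambda x: os.curdir"}

compiled_funcs = {k: eval(v, safe_globals, {}) for k, v in lambdas.items()}

compiled_funcs['Report Date'](1)   # works: 'now' is whitelisted
compiled_funcs['Scary Test'](1)    # fails with NameError: name 'os' is not defined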

Related

Forward function signature in VSCode

Say I have a function in src/f1.py with the following signature:
def my_func(a1: int, a2: bool) -> float:
...
In a separate file src/f2.py I create a dictionary:
from src.f1 import my_func
my_dict = {
    "func": my_func
}
In my final file src/test1.py I call the function:
from src.f2 import my_dict
print(my_dict["func"](1, True))
I'm getting IDE auto-suggestions for my_func in src/f2.py, but not in src/test1.py. I have tried using typing.Callable, but it doesn't create the same signature and it loses the function documentation. Is there any way I can get these in src/test1.py without changing the structure of my code?
I don't want to change the files in which my functions, dictionaries, or tests are declared.
I use VSCode version 1.73.1 and Python version 3.8.13. I cannot change my Python version.
I tried creating different types of Callable annotations, but had problems getting the desired results:
- They seem to have no docstring support.
- Some of the types are optional, and I couldn't get that to work.
- They only work with data types, not variable names. I want the variable names (the argument names of the function) to appear in the IDE suggestion.
What am I trying to do really?
I am trying to implement a mechanism where a user can set the configurations for a library in a single file. That single file is where all the dictionaries are stored (and it imports all essential functions).
This "configuration dictionary" is called in the main python file (or wherever needed). I have functions in a set of files for accomplishing a specific set of tasks.
Say functions fa1 and fa2 for task A; fb1, fb2, and fb3 for task B; and fc1 for task C. I want configurations (choices) for task A, followed by B, then C. So I do
work_ABC = {"A": fa1, "B": fb2, "C": fc1};
Now, I have a function like
wd = work_ABC

def do_work_abc(i, do_B=True):
    res_a = wd["A"](i)
    res_b = res_a
    if do_B:
        res_b = wd["B"](res_a)
    res_c = wd["C"](res_b)
    return res_c
If you have a more efficient way to implement the same thing, I'd love to hear it.
I want IntelliSense to give me the function signature of the function set for the dictionary.
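For concreteness, the kind of annotation I tried in src/f2.py looks roughly like the following (my reconstruction; the exact Callable shape is inferred from my_func). It type-checks, but the parameter names and the docstring are gone from the suggestion:

from typing import Callable, Dict

from src.f1 import my_func

my_dict: Dict[str, Callable[[int, bool], float]] = {
    "func": my_func
}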
There is no type annotation construct in Python that covers docstrings or parameter names of functions. There isn't even one for positional-only or keyword-only parameters (which would be actually meaningful in a type sense).
As I already mentioned in my comment, docstrings and names are not type-related. Ultimately, this is an IDE issue. PyCharm for example has no problem inferring those details with the setup you provided. I get the auto-suggestion for my_dict["func"] with the parameter names because PyCharm is smart/heavy enough to track it to the source. But it has its limits. If I change the code in f2 to this:
from src.f1 import my_func

_other_dict = {
    "func": my_func
}
my_dict = {}
my_dict.update(_other_dict)
Then the suggestion engine is lost.
The reason is simply the discrepancy between runtime and static analysis. At some point it becomes silly/unreasonable to expect a static analysis tool to essentially run the code for you. This is why I always say:
Static type checkers don't execute your code, they just read it.
Even the fact that PyCharm "knows" the signature of my_func with your setup entails it running some non-trivial code in the background to back-track from the dictionary key to the dictionary to the actual function definition.
So in short: It appears you are out of luck with VSCode. And parameter names and docstrings are not part of the type system.

Show whether a Python module is loaded from bytecode

I'm trying to debug Hy's use of bytecode. In particular, each time a module is imported, I want to see the path it was actually imported from, whether source or bytecode. Under the hood, Hy manages modules with importlib. It doesn't explicitly read or write bytecode; that's taken care of by importlib.machinery.SourceFileLoader. So it looks like what I want to do is monkey-patch Python's importing system to print the import path each time an import happens. How can I do that? I should be able to figure out how to do it for Hy once I understand how to do it for Python.
The easiest way that does not involve coding is to start Python with two(!) verbose flags:
python -vv myscript.py
You'll get a lot of output, including all the import statements and all the files Python tries to access in order to load each module. In this example I have a simple Python script that does import json:
lots of output!
[...]
# trying /tmp/json.cpython-310-x86_64-linux-gnu.so
# trying /tmp/json.abi3.so
# trying /tmp/json.so
# trying /tmp/json.py
# trying /tmp/json.pyc
# /usr/lib/python3.10/json/__pycache__/__init__.cpython-310.pyc matches /usr/lib/python3.10/json/__init__.py
# code object from '/usr/lib/python3.10/json/__pycache__/__init__.cpython-310.pyc'
[...]
Alternatively, but more complex: you could hook the import statement itself. For that, you can overwrite __import__, which is what the import statement invokes under the hood. This way you could print out all the files import actually opens.
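As a rough sketch of that __import__ idea (mine, not tested with Hy): wrap builtins.__import__ and print where each imported module reports it lives. Bear in mind that __file__ is the source path whenever source exists, even when cached bytecode was actually used, so this only gets you part of the way:

import builtins

_original_import = builtins.__import__

def verbose_import(name, globals=None, locals=None, fromlist=(), level=0):
    module = _original_import(name, globals, locals, fromlist, level)
    # truly built-in modules have no __file__ at all, hence the fallback
    print("import {!r} -> {}".format(name, getattr(module, "__file__", "<built-in>")))
    return module

builtins.__import__ = verbose_import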
It seems like a good option would be to dynamically patch importlib.machinery.SourceFileLoader(fullname, path) and importlib.machinery.SourcelessFileLoader(fullname, path) so that each one prints, or writes to a variable, (a) the calling method and (b) the arguments passed to it.
If all you need to do is:
I want to see the path it was actually imported from, whether source or bytecode
And you don't need the import to "work properly", then perhaps you can do a modified version of something like this. For example, I quickly modified their sample code to get the following. I have not tested it, so it may not work exactly, but it should get you on the right track:
import importlib.machinery

# custom class to be the mock return value
class MockSourceLoader:
    # mock SourceFileLoader method always reports that the module was loaded
    # from source, together with its name and path
    @staticmethod
    def SourceFileLoader(fullname, path):
        return {"load type": "source", "fullname": fullname, "path": path}

def check_how_imported(monkeypatch):
    # apply the monkeypatch: any later lookup of SourceFileLoader gets the mock
    monkeypatch.setattr(importlib.machinery, "SourceFileLoader",
                        MockSourceLoader.SourceFileLoader)
You would of course provide a similar mock for sourceless (bytecode) file loading via SourcelessFileLoader.
For reference:
https://docs.python.org/3/library/importlib.html#importlib.machinery.SourceFileLoader
https://docs.python.org/3/library/importlib.html#importlib.machinery.SourcelessFileLoader

How to call variables from two different Python modules (bi-directional)

I am stuck resolving a problem in Python. I have to pass a variable's value from one module (python_code1.py) to a different module (python_code2.py). Based on that value, python_code2.py needs to do some calculation, and then python_code1.py needs to capture the output value for further calculations.
Below is a snapshot of my code logic:
python_code2.py

import python_code1

data = python_code1.json_data
'''
Lines of code
'''
output = "some variable attributes"

python_code1.py

import python_code2

json_data = {"val1": "abc1", "val2": "abc2", "val3": "abc3"}
input_data = python_code2.output
'''
Lines of code using input_data variable
'''
When I execute python python_code1.py, it gives the following error:
AttributeError: module 'python_code2' has no attribute 'output'
I feel like I am not doing it the right way, but considering my code's complexity and size, I have to use this two-module approach.
Putting your code at the top level is fine for quick throw-away scripts, but that's about all. The proper way to organize your code for anything non-trivial is to define functions, so you can pass values as arguments and get results as return values.
If there's only one script using those functions, you can keep them in the script itself.
If you have multiple scripts needing to use the same functions, move those functions to a module and import this module from your scripts.
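For illustration, a minimal sketch of that refactor applied to the snippets from the question (function and variable names are just placeholders): python_code2.py exposes a function instead of running code at import time, and python_code1.py calls it with the data it needs, so there is no circular import at all.

python_code2.py

def calculate(json_data):
    '''
    Lines of code using json_data
    '''
    return "some variable attributes"

python_code1.py

from python_code2 import calculate

json_data = {"val1": "abc1", "val2": "abc2", "val3": "abc3"}
input_data = calculate(json_data)
'''
Lines of code using input_data variable
'''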

How to extend built-in classes in python

I'm writing some code for an ESP8266 microcontroller using MicroPython, and it has some different classes as well as some additional methods in the standard built-in classes. To allow me to debug on my desktop, I've built some helper classes so that the code will run. However, I've run into a snag with MicroPython's time module, which has a time.sleep_ms method, since the standard time.sleep method on MicroPython does not accept floats. I tried using the following code to extend the built-in time class, but it fails to import properly. Any thoughts?
class time(time):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def sleep_ms(self, ms):
        super().sleep(ms/1000)
This code exists in a file time.py. Secondly, I know I'll have issues with having to import time.time, which I would like to fix. I also realize I could call this something else and put traps for it in my microcontroller code; however, I would like to avoid any special functions in what's loaded onto the controller, to save space and cycles.
You're not trying to override a class, you're trying to monkey-patch a module.
First off, if your module is named time.py, it will never be loaded in preference to the built-in time module. Truly built-in (as in compiled into the interpreter core, not just C extension modules that ship with CPython) modules are special, they are always loaded without checking sys.path, so you can't even attempt to shadow the time module, even if you wanted to (you generally don't, and doing so is incredibly ugly). In this case, the built-in time module shadows you; you can't import your module under the plain name time at all, because the built-in will be found without even looking at sys.path.
Secondly, assuming you use a different name and import it for the sole purpose of monkey-patching time (or do something terrible like adding the monkey-patch to a custom sitecustomize module), it's not trivial to make the function truly native to the monkey-patched module (defining it in any normal way gives it the scope of the module where it was defined, not the same scope as other functions from the time module). If you don't need it to be "truly" defined as part of time, the simplest approach is just:
import time

def sleep_ms(ms):
    return time.sleep(ms / 1000)

time.sleep_ms = sleep_ms
Of course, as mentioned, sleep_ms is still part of your module, and carries your module's scope around with it (that's why you do time.sleep, not just sleep; you could do from time import sleep to avoid qualifying it, but it's still a local alias that might not match time.sleep if someone else monkey-patches time.sleep later).
If you want to make it behave like it's part of the time module, so you can reference arbitrary things in time's namespace without qualification and always see the current function in time, you need to use eval to compile your code in time's scope:
import time
# Compile a string of the function's source to a code object that's not
# attached to any scope at all
# The filename argument is garbage, it's just for exception traceback
# reporting and the like
code = compile('def sleep_ms(ms): sleep(ms / 1000)', 'time.py', 'exec')
# eval the compiled code with a scope of the globals of the time module
# which both binds it to the time module's scope, and inserts the newly
# defined function directly into the time module's globals without
# defining it in your own module at all
eval(code, vars(time))
del code, time # May as well leave your monkey-patch module completely empty
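For illustration, a usage sketch of either version above (with the monkey-patch module hypothetically saved as patch_time.py):

# 'patch_time' is a hypothetical name for the module holding the monkey-patch
import patch_time  # imported only for its side effect of patching time
import time

time.sleep_ms(250)  # sleeps for 0.25 seconds via the patched-in wrapper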

Get symbol for def in module in Python

I'm writing an interpreter for an old in-game scripting language, and so need to compile a dictionary that has the name of each command from the language matched up against the symbol for that function.
Now, I've already figured out here: How to call a function based on list entry?
...That you can call functions this way, and I know that you can use dir to get a list of strings of all functions in a module. I've been able to get this list, and using a regex, removed the built-in commands and anything else I don't actually want the script to be able to call. The goal is to sandbox here. :)
Now that I have the list of items that are defined in the module, I need to get the symbol for each definition.
For a more visual representation, this is the test module I want to get the symbol for:
def notify(stack, mufenv):
    print stack[-1]
It's pulled in via an init script, and I am able to get the notify function's name in a list using:
import mufprims
import re

moddefs = dir(mufprims)
primsfilter = re.compile('__.+__')
primslist = ['mufprims.' + x for x in dir(mufprims) if not primsfilter.match(x)]
print primslist
This returns:
['mufprims.notify']
...which is the exact name of the function I wish to find the symbol for.
I read over http://docs.python.org/library/symtable.html here, but I'm not sure I understand it. I think this is the key to what I want, but I didn't see an example that I could understand. Any ideas how I would get the symbol for the functions I've pulled from the list?
You want to get the function from the mufprims module by using getattr and the function name. Like so:
primslist = [getattr(mufprims, x) for x in dir(mufprims) if not primsfilter.match(x)]
I thought I might add another possible suggestion for retrieving the functions of an object:
import inspect
# example using os.path
import os.path
results = inspect.getmembers(os.path, inspect.isroutine)
print results
# truncated result
[...,
('splitdrive', <function splitdrive at 0x1002bcb18>),
('splitext', <function splitext at 0x1002bcb90>),
('walk', <function walk at 0x1002bda28>)]
Using dir on the object would essentially give you every member of that object, including non-callable attributes, etc. You could use the inspect module to get a more controlled return type.
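To tie it back to the original goal, here is a small sketch (reusing the mufprims example from the question, so the module and function names are the question's, not mine) that builds the command-name-to-function dictionary and calls one entry:

import inspect
import re

import mufprims

primsfilter = re.compile('__.+__')
# map each command name to the function object itself
commands = dict((name, func)
                for name, func in inspect.getmembers(mufprims, inspect.isroutine)
                if not primsfilter.match(name))

commands['notify'](['hello world'], {})  # runs mufprims.notify, printing 'hello world'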
