Safely execute script coming from user input in Python - python

I'm looking for a safe mechanism in Python to execute potentially unsafe script code coming from a user.
Code example:
def public(v):
    print('Allowed to trigger this with '+v)
def secret():
    print('Not allowed to trigger this')
unsafe_user_code = '''
def user_function(a):
    if 'o' in a:
        public('parameter')
a = 'hello'
user_function(a)
'''
run_code(unsafe_user_code, allowed=['public'])
This could be easily achieved with exec() but as I understand there is no way of using exec() in a safe way in Python.
Here are my requirements:
The syntax of the script should ideally be similar to Python (but could also be something like JavaScript)
Standard mechanisms like string operations, if/else/while and definition of own variables and functions need to be available in the script
The script should only be able to execute certain functions (in the example, only public() would be allowed)
I would like to rely on Python-based implementations/libraries only as I don't want to have dependencies on Python-external software
It should not introduce a security risk
The only way I found so far is to use a parsing library, where I would have to define everything by myself (e.g. this one: https://github.com/lark-parser/lark ).
Is there a better way to achieve something like this?
Thanks!

Answer
Here's the full python file if you want to copy:
from copy import copy
def public(v):
    print('Allowed to trigger this with '+v)
def secret():
    print('Not allowed to trigger this')
unsafe_user_code = '''
def user_function(a):
    if 'o' in a:
        public('parameter')
    # secret()
    # If you uncomment it, it'll throw an error saying 'secret' does not exist
a = 'hello'
user_function(a)
'''
def run_code(code_to_run,allowed:list[str]):
    g = globals()
    allowed_dict = {f"{name}": g[name] for name in allowed}
    code = compile(code_to_run,'','exec')
    def useless():
        pass
    useless.__code__ = code
    g = copy(useless.__globals__)
    for x in g:
        del useless.__globals__[x]
    for x in allowed_dict:
        useless.__globals__[x] = allowed_dict[x]
    useless()
run_code(unsafe_user_code, allowed=['public'])
Explanation
I don't think you need a parsing library.
You can use one of Python's built-in functions, compile().
You can compile code like this:
text_to_compile = "print('hello world')"
code = compile(
    text_to_compile, # text to compile
    'file_name', # file name
    'exec' # compile mode
)
more about compile modes
and then you can run it simply by:
exec(code) # prints 'hello world' in the terminal
or, you can put that code inside of a function and run the function:
def useless():
    pass
useless.__code__ = code
useless() # prints 'hello world' in the terminal
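On the compile modes mentioned above, here is a minimal sketch of the difference between 'exec' and 'eval' (the variable names are illustrative, not from the code above). 'eval' accepts a single expression and yields its value, while 'exec' accepts statements and leaves its results in a namespace:

```python
# 'eval' mode: the source must be a single expression; eval() returns its value.
expr_code = compile("1 + 2", "<expr>", "eval")
result = eval(expr_code)

# 'exec' mode: the source may contain statements; results land in the namespace.
namespace = {}
stmt_code = compile("total = 1 + 2", "<stmt>", "exec")
exec(stmt_code, namespace)

# An assignment is a statement, so 'eval' mode refuses to compile it.
try:
    compile("total = 1 + 2", "<stmt>", "eval")
    rejected = False
except SyntaxError:
    rejected = True

print(result, namespace["total"], rejected)
```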
Now, to change the scope of the function, we can access its __globals__ dictionary and remove everything from it.
Then we add the functions/variables that we want to it.
# from copy import copy
def run_code(code_to_run,allowed:list[str]):
    g = globals()
    allowed_dict = {f"{name}": g[name] for name in allowed}
    code = compile(code_to_run,'','exec')
    def useless():
        pass
    useless.__code__ = code
    g = copy(useless.__globals__)
    for x in g:
        del useless.__globals__[x]
    for x in allowed_dict:
        useless.__globals__[x] = allowed_dict[x]
    useless()
alright let's talk about what's happening:
def run_code(code_to_run,allowed:list[str]):
The :list[str] is just a type hint for the allowed parameter.
g = globals()
Using the globals() function we get all of the variables that the run_code function has access to. This includes the public and secret functions.
allowed_dict = {f"{name}": g[name] for name in allowed}
Then we take the names that were passed to run_code and look up the matching objects in those globals.
It's pretty much the same as saying:
allowed_dict = {}
for name in allowed:
allowed_dict[name] = g[name]
Then we compile the code.
We also make a do-nothing function, useless, and put the compiled code into it:
code = compile(code_to_run,'','exec')
def useless():
    pass
useless.__code__ = code
After that, we copy the globals of the useless function:
g = copy(useless.__globals__)
We copy the globals so that we can loop over the copy while deleting from the original.
We do this because a dictionary cannot change size while a for loop is iterating over it.
for x in g:
    del useless.__globals__[x]
This deletes everything in the useless function's globals.
The useless function itself is empty, but its globals are filled with all sorts of things, like the public and secret functions.
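As an aside on the copy step, this small sketch (with made-up dictionaries) shows what goes wrong without it, and that iterating over a snapshot of the keys is enough:

```python
d = {"a": 1, "b": 2, "c": 3}

# Deleting while iterating over the dict itself raises RuntimeError.
try:
    for key in d:
        del d[key]
    raised = False
except RuntimeError:
    raised = True

# Iterating over a snapshot of the keys lets us empty the original safely.
e = {"a": 1, "b": 2, "c": 3}
for key in list(e):
    del e[key]

print(raised, e)
```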
Then we put all the allowed variables into the globals of the function, so that the code inside the function can use those variables/functions.
for x in allowed_dict:
    useless.__globals__[x] = allowed_dict[x]
Then we can just run the useless function:
useless()
And that's pretty much all of it.
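A variation worth considering: useless.__globals__ is the same dictionary as the module's own globals(), so emptying it also empties the namespace of the module that defines run_code. exec() accepts an explicit globals dictionary, which gives the same name restriction without touching the module. The sketch below makes two assumptions that differ from the code above: allowed is a mapping from name to object rather than a list of names, and __builtins__ is set to an empty dict so that print, open and __import__ are hidden too. It only limits which names are visible. It is not a security boundary, because user code can still reach other objects through attribute access (for example ().__class__.__base__.__subclasses__()), so on its own it does not satisfy the "no security risk" requirement.

```python
def public(v):
    print('Allowed to trigger this with ' + v)

def secret():
    print('Not allowed to trigger this')

def run_code(code_to_run, allowed):
    """Execute code_to_run with only the entries of `allowed` as globals.

    `allowed` maps names to objects (an assumption; the original takes a
    list of names). Returns the namespace the code ran in.
    """
    namespace = dict(allowed)
    namespace['__builtins__'] = {}  # hide print, open, __import__, ...
    exec(compile(code_to_run, '<user code>', 'exec'), namespace)
    return namespace

unsafe_user_code = '''
def user_function(a):
    if 'o' in a:
        public('parameter')
a = 'hello'
user_function(a)
'''
run_code(unsafe_user_code, allowed={'public': public})

# secret is not in the namespace, so calling it raises NameError.
try:
    run_code('secret()', allowed={'public': public})
except NameError as error:
    print('blocked:', error)
```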

Related

How do I call globals() from another imported function in Python?

I am currently developing an automated function tester in Python.
The purpose of this application is to automatically test if functions are returning an expected return type based on their defined hints.
Currently I have two test functions (one which fails and one which passes), along with the rest of my code in one file. My code utilizes the globals() command in order to scan the Python file for all existing functions and to isolate user-made functions and exclude the default ones.
This initial iteration works well. Now I am trying to import the function and use it from another .py file.
When I run it in the other .py file it still returns results for the functions from the original file instead of the new test-cases in the new file.
Original File - The Main Application
from math import floor
import random
#declaring test variables
test_string = 'test_string'
test_float = float(random.random() * 10)
test_int = int(floor(random.random() * 10))
#Currently supported test types (input and return)
supported_types = ['int', 'float', 'str']
autotest_result = {}
def int_ret(number: int) -> str:
    string = "cactusmonster"
    return string
def false_test(number: int) -> str:
    floating = 3.2222
    return floating
def test_typematching():
    for name in list(globals()):
        if not name.startswith('__'):
            try:
                return_type = str((globals()[name].__annotations__)['return'])
                autotest_result.update({name: return_type.replace("<class '", "").replace("'>", "")})
            except:
                continue
    for func in autotest_result:
        if autotest_result[func] != None:
            this_func = globals()[func].__annotations__
            for arg in this_func:
                if arg != 'return':
                    input_type = str(this_func[arg]).replace("<class '", "").replace("'>", "")
                    for available in supported_types:
                        if available == input_type:
                            func_return = globals()[func]("test_" + input_type)
                            actual_return_type = str(type(func_return)).replace("<class '", "").replace("'>", "")
                            if actual_return_type == autotest_result[func]:
                                autotest_result[func] = 'Passed'
                            else:
                                autotest_result[func] = 'Failed'
    return autotest_result
Test File - Where I Am Importing The "test_typematching()" Function
from auto_test import test_typematching
print(test_typematching())
def int_ret_newfile(number: int) -> str:
    string="cactusmonster"
    # print(string)
    # return type(number)
    return string
Regardless if I run my main "auto_test.py" file or the "tester.py" file, I still get the following output:
{'int_ret': 'Passed', 'false_test': 'Failed'}
I am guessing this means that even when I am running the function from auto_test.py on my tester.py file it still just scans itself. I would like it to scan the file where the function is currently being called. For example, I expect it to test the int_ret_newfile function of tester.py.
Any advice or help would be much appreciated.
globals() is a bit of a misnomer. It gets the calling module's __dict__. (Python's true "global" namespace is actually builtins.)
How can globals() get its caller's __dict__ when it's defined in the builtins module? Here's a clue:
PyObject *
PyEval_GetGlobals(void)
{
    PyThreadState *tstate = _PyThreadState_GET();
    PyFrameObject *current_frame = _PyEval_GetFrame(tstate);
    if (current_frame == NULL) {
        return NULL;
    }
    assert(current_frame->f_globals != NULL);
    return current_frame->f_globals;
}
globals() is one of those builtins that's implemented in C (in CPython), but you get the gist. It reads the frame globals from the current stack frame, so in Python,
import inspect
inspect.currentframe().f_globals
would do the same thing as globals(). But you can't just put this in a function and expect it to work the same way, because calling it would add a stack frame, and that frame's globals depends on the function's .__globals__ attribute, which is set to the .__dict__ of the module that defined it. You want the caller's frame.
def myglobals():
    """Behaves like the builtin globals(), but written in Python!"""
    return inspect.currentframe().f_back.f_globals
You could do the same thing in test_typematching. But walking up the stack to the previous frame like that is a weird thing to do. It can be surprising and brittle. It amounts to passing the caller's frame as an implicit hidden argument, something that normally is not supposed to matter. Consider what happens if you wrap it in a decorator. Now which stack frame are you getting the globals from?
So really, you should be passing in globals() as an explicit argument to test_typematching(), like test_typematching(globals()). A defined and documented parameter would be much less confusing than implicit introspection. "Explicit is better than implicit".
Still, Python's standard library does do this kind of thing occasionally, with globals() itself being a notable example. And exec() can use the current namespace if you don't give it a different one. It's also how super() can now work without arguments in Python 3. So stack frame inspection does have precedent for this kind of use case.
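To make the explicit-argument suggestion concrete, here is a rough sketch. The helper name annotated_functions is made up for illustration and only covers the scanning step of test_typematching, not the pass/fail logic:

```python
import inspect

def annotated_functions(namespace):
    """Map each function in `namespace` that has a return annotation to that annotation."""
    found = {}
    for name, obj in list(namespace.items()):
        if inspect.isfunction(obj) and 'return' in obj.__annotations__:
            found[name] = obj.__annotations__['return']
    return found

def int_ret_newfile(number: int) -> str:
    return "cactusmonster"

# The caller decides which namespace is scanned by passing it in.
print(annotated_functions(globals()))
```

Because the namespace is an ordinary parameter, the function behaves the same whether it is called directly, through a wrapper, or from another module.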

Python: how to get a function based on whether it matches an assigned string to it [duplicate]

I have a function name stored in a variable like this:
myvar = 'mypackage.mymodule.myfunction'
and I now want to call myfunction like this
myvar(parameter1, parameter2)
What's the easiest way to achieve this?
funcdict = {
    'mypackage.mymodule.myfunction': mypackage.mymodule.myfunction,
    ....
}
funcdict[myvar](parameter1, parameter2)
It's much nicer to be able to just store the function itself, since they're first-class objects in python.
import mypackage
myfunc = mypackage.mymodule.myfunction
myfunc(parameter1, parameter2)
But, if you have to import the package dynamically, then you can achieve this through:
mypackage = __import__('mypackage')
mymodule = getattr(mypackage, 'mymodule')
myfunction = getattr(mymodule, 'myfunction')
myfunction(parameter1, parameter2)
Bear in mind however, that all of that work applies to whatever scope you're currently in. If you don't persist them somehow, you can't count on them staying around if you leave the local scope.
def f(a,b):
    return a+b
xx = 'f'
print(eval('%s(%s,%s)'%(xx,2,3)))
OUTPUT
5
Easiest
eval(myvar)(parameter1, parameter2)
You don't have a function "pointer". You have a function "name".
While this works well, you will have a large number of folks telling you it's "insecure" or a "security risk".
Why not store the function itself? myvar = mypackage.mymodule.myfunction is much cleaner.
import sys
modname, funcname = myvar.rsplit('.', 1)
getattr(sys.modules[modname], funcname)(parameter1, parameter2)
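Note that sys.modules only contains modules that have already been imported. A hedged variant that imports the module on demand could look like this (resolve is a hypothetical helper name, and math.sqrt stands in for mypackage.mymodule.myfunction so the sketch runs anywhere):

```python
import importlib

def resolve(dotted_name):
    """Import the module part of 'pkg.mod.func' and return the named attribute."""
    module_name, _, attr_name = dotted_name.rpartition('.')
    if not module_name:
        raise ValueError('expected a dotted name like "package.module.function"')
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)

sqrt = resolve('math.sqrt')
print(sqrt(9))
```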
eval(compile(myvar,'<str>','eval'))(myargs)
compile(...,'eval') allows only a single expression, so there can't be arbitrary statements after a call, or there will be a SyntaxError. Then a tiny bit of validation can at least constrain the expression to something in your power, like testing that it starts with 'mypackage'.
I ran into a similar problem while creating a library to handle authentication. I want the app owner using my library to be able to register a callback with the library for checking authorization against LDAP groups the authenticated person is in. The configuration is getting passed in as a config.py file that gets imported and contains a dict with all the config parameters.
I got this to work:
>>> class MyClass(object):
...     def target_func(self):
...         print("made it!")
...
...     def __init__(self,config):
...         self.config = config
...         self.config['funcname'] = getattr(self,self.config['funcname'])
...         self.config['funcname']()
...
>>> instance = MyClass({'funcname':'target_func'})
made it!
Is there a pythonic-er way to do this?

Python function doesnt work properly when imported, only when locally defined

I have a file called functions.py in my project directory which contains a few simple functions.
One of these functions (load()) is supposed to create global variables without requiring the user to define them beforehand.
When I import the functions from the file using
from functions import load
It doesn't work properly, meaning the global variables are not created.
However, when I copy paste the function to define it instead of importing from the file, it works properly. Here is the complete function:
import pandas as pd
def load(cnv):
    global hd1, hd2, variables, datapoints
    hd1, hd2, variables, datapoints = [], [], [], []
    o = open(cnv)
    r = o.readlines()
    o.close()
    for line in r:
        if not line:
            pass
        elif line.startswith('*'):
            hd1.append(line)
        elif line.startswith('#'):
            hd2.append(line)
            if line.startswith('# name'):
                line = line.split()
                variables.append(line[4])
        else:
            float_list = []
            line = line.split()
            for item in line:
                float_list.append(float(item))
            datapoints.append(float_list)
    datapoints = filter(None, datapoints)
    global df
    df = pd.DataFrame(datapoints, columns = variables)
I am very new to programming so I'm taking suggestions on how to potentially improve this code.
Thanks in advance.
The "global" variables are not really global to the scope of the whole script. This is not PHP.
Python structures everything into per-module namespaces. The variables are "global" to the module in which the function is defined. So, when you import the function and call it, it creates the variables in that module, where they can be accessed as functions.hd1, functions.hd2, etc.
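A small sketch of that behaviour. To keep it in one file, types.ModuleType stands in for functions.py here (that substitution and the sample data are assumptions for illustration):

```python
import types

# Stand-in for functions.py, so the sketch runs as a single file.
functions = types.ModuleType('functions')
source = '''
def load():
    global hd1
    hd1 = ['header line']
'''
exec(source, functions.__dict__)

load = functions.load      # equivalent in effect to: from functions import load
load()

# The variable lives in the module that defines load, not in the caller.
print(functions.hd1)
# That module's namespace is what load treats as its "global" scope.
print(load.__globals__ is functions.__dict__)
```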
It depends on how you use it. To use the globals, you first need to call the load function to initialize the variables; after that you can access them as module.variable.
Please find the below implementation.
#functions.py
def load():
    global hd1
    hd1 = []
#test.py
import functions
functions.load()
functions.hd1.append('hey')
print(functions.hd1)
OR
#test.py
from functions import load
load()
from functions import *
hd1.append('hey')
print(hd1)

executing python code from string loaded into a module

I found the following code snippet that I can't seem to make work for my scenario (or any scenario at all):
def load(code):
    # Delete all local variables
    globals()['code'] = code
    del locals()['code']
    # Run the code
    exec(globals()['code'])
    # Delete any global variables we've added
    del globals()['load']
    del globals()['code']
    # Copy k so we can use it
    if 'k' in locals():
        globals()['k'] = locals()['k']
        del locals()['k']
    # Copy the rest of the variables
    for k in locals().keys():
        globals()[k] = locals()[k]
I created a file called "dynamic_module" and put this code in it, which I then used to try to execute the following code which is a placeholder for some dynamically created string I would like to execute.
import random
import datetime
class MyClass(object):
    def main(self, a, b):
        r = random.Random(datetime.datetime.now().microsecond)
        a = r.randint(a, b)
        return a
Then I tried executing the following:
import dynamic_module
dynamic_module.load(code_string)
return_value = dynamic_module.MyClass().main(1,100)
When this runs it should return a random number between 1 and 100. However, I can't seem to get the initial snippet I found to work for even the simplest of code strings. I think part of my confusion in doing this is that I may misunderstand how globals and locals work and therefore how to properly fix the problems I'm encountering. I need the code string to use its own imports and variables and not have access to the ones where it is being run from, which is the reason I am going through this somewhat over-complicated method.
You should not be using the code you found. It has several big problems, not least that most of it doesn't actually do anything (locals() is a proxy, so deleting from it has no effect on the actual locals), and it puts any code you execute in the same shared globals.
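The point about locals() is easy to check with a tiny sketch (demo is just an illustrative name):

```python
def demo():
    value = 42
    del locals()['value']   # removes a key from a dict, not the real local variable
    return value            # still bound

print(demo())
```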
Use the accepted answer in that post instead; recast as a function that becomes:
import sys, imp
def load_module_from_string(code, name='dynamic_module'):
    module = imp.new_module(name)
    exec(code, module.__dict__)
    return module
then just use that:
dynamic_module = load_module_from_string(code_string)
return_value = dynamic_module.MyClass().main(1, 100)
The function produces a new, clean module object.
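The imp module was deprecated for a long time and was removed in Python 3.12. On current versions, the same idea can be sketched with types.ModuleType; the code string below is a shortened stand-in for the one in the question:

```python
import types

def load_module_from_string(code, name='dynamic_module'):
    """Create a fresh module object and execute `code` inside its namespace."""
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    return module

code_string = '''
import random

class MyClass(object):
    def main(self, a, b):
        return random.randint(a, b)
'''

dynamic_module = load_module_from_string(code_string)
return_value = dynamic_module.MyClass().main(1, 100)
print(return_value)
```

The executed code gets its own imports and variables in the new module's namespace. This is isolation of names only, not a sandbox.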
In general, this is not how you should dynamically import and use external modules. You should be using __import__ within your function to do this. Here's a simple example that worked for me:
plt = __import__('matplotlib.pyplot', fromlist = ['plt'])
plt.plot(np.arange(5), np.arange(5))
plt.show()
I imagine that for your specific application (loading from code string) it would be much easier to save the dynamically generated code string to a file (in a folder containing an __init__.py file) and then to call it using __import__. Then you could access all variables and functions of the code as parts of the imported module.
Unless I'm missing something?
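For reference, a short sketch of how __import__ behaves with dotted names, next to importlib.import_module, which is usually the simpler call for this. json.decoder is used here only because it ships with Python:

```python
import importlib

# Without fromlist, __import__ returns the top-level package.
top = __import__('json.decoder')
# With a non-empty fromlist, it returns the submodule itself.
sub = __import__('json.decoder', fromlist=['decoder'])
# import_module always returns the module that was named.
same = importlib.import_module('json.decoder')

print(top.__name__, sub.__name__, same.__name__)
```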

Python creating dynamic global variable from list

I am trying to make generic config, and thus config parser. There are two config files say A and B. I want to parse sections and make global values from them according to hardcoded list.
Here is an example:
in config file:
[section]
var1 = value1
var2 = value2
In global scope:
some_global_list = [ ["var1","var2"], ["var3","var4"] ]
in function to unpack this values, by ConfigParser:
configparser = ConfigParser.RawConfigParser()
configparser.read(some_config_filename)
for variables in some_global_list:
    globals()[section]=dict()
    for element in configparser.items(section):
        globals()[section].update({element[0]:element[1]})
This works nicely... however, the scope of globals() seems to be limited to the function, which is obviously not what I intended. I can access the variable only while in that function.
Could someone share better yet simple idea?
I know that i might move code to main and not to worry, but I don't think it is a good idea.
I thought also about making some generator (sorry for pseudocode here):
in global scope:
for x in some_global_list:
    globals()[x] = x
also tried adding this to function:
for x in some_global_list[0]:
    global x
but got nowhere.
Edit :
After discussion below, here it is:
Problem solved like this:
removed whole configuration from main script to module
imported (from module import somefunction) config from module
removed globals() in fact didnt need them, since function was changed a little like so:
in function:
def somefunction():
    #(...)
    configparser = ConfigParser.RawConfigParser()
    configparser.read(some_config_filename)
    temp_list=[]
    for variables in some_global_list:
        tmp=dict()
        for element in configparser.items(section):
            tmp.update({element[0]:element[1]})
        temp_list.append (tmp)
    return temp_list #this is pack for one file.
now in main script
tmp=[]
for i,conf_file in enumerate([args.conf1,args.conf2,args.conf3]):
    if conf_file:
        try:
            tmp.append([function(params...)])
        except:
            #handling here
    #but since i needed those config names as global variables
    for j,variable_set in enumerate(known_variable_names[i]):
        globals()[variable_set] = tmp[i][j]
So the unfortunate hack persists, but it seems to work. Thanks for your help, guys.
I'm accepting (if that's possible) the answer below since it gave me a good idea :)
A simple way to solve this issue is to do something similar to the following in the __init__.py of your application package:
def read_config():
    configparser = ConfigParser.RawConfigParser()
    configparser.read(some_config_filename)
    return configparser.as_dict() # An imaginary method which returns the values as a dict.
app_config = read_config()
The "app_config" variable can be imported into any other module within the application.
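Since as_dict() above is imaginary, here is one possible way to build that dictionary, assuming Python 3's configparser module (the code above uses the Python 2 spelling ConfigParser). The inline sample text stands in for the real config file:

```python
import configparser

def read_config(text):
    """Parse INI-style text into {section: {option: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}

sample = """
[section]
var1 = value1
var2 = value2
"""

app_config = read_config(sample)
print(app_config['section']['var1'])
```

Other modules can then import app_config and look values up by section and name, with no need to write into globals().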
