"writing a python binding" vs "using command-line directly" - python

I have a question regarding python bindings.
I have a command-line tool which exposes some functionality, and the code has been refactored to provide the same functionality through a shared library. I wanted to know what real advantage I get from "writing a python binding for the shared library" vs "calling the command line directly".
One obvious advantage I think will be performance: the shared library will be linked into the same process, so the functionality can be called within the same process. It avoids spawning a new process through the command line.
Are there any other advantages I can get from writing a python binding for such a case?
Thanks.

I can hardly imagine a case where one would prefer wrapping a library's command line interface over wrapping the library itself. (Unless there is a library that comes with a neat command line interface while being a total mess internally; but the OP indicates that the same functionality available via the command line is easily accessible in terms of library function calls).
The biggest advantage of writing a Python binding is a clearly defined data interface between the library and Python. Ideally, the library can operate directly on memory managed by Python, without any data copying involved.
To illustrate this, let's assume a library function does something more complicated than printing the current time, i.e., it obtains a significant amount of data as an input, performs some operation, and returns a significant amount of data as an output. If the input data is expected as an input file, Python would need to generate this file first. It must make sure that the OS has finished writing the file before calling the library via the command line (I have seen several C libraries where sleep(1) calls were used as a band-aid for this issue...). And Python must get the output back in some way.
If the command line interface does not rely on files but obtains all arguments on the command line and prints the output on stdout, Python probably needs to convert between binary data and string format, not always with the expected results. It also needs to pipe stdout back and parse it. Not a problem, but getting all this right is a lot of work.
What about error handling? Well, the command line interface will probably handle errors by printing error messages on stderr. So Python needs to capture, parse and process these as well. OTOH, the corresponding library function will almost certainly make a success flag accessible to the calling program. This is much more directly usable for Python.
All of this is obviously affecting performance, which you already mentioned.
As another point, if you are developing the library yourself, you will probably find after some time that the Python workflow has made the whole command line interface obsolete, so you can drop supporting it altogether and save yourself a lot of time.
So I think there is a clear case to be made for the Python bindings. To me, one of the biggest strengths of Python is the ease with which such wrappers can be created and maintained. Unfortunately, there are about 7 or 8 equally easy ways to do this. To get started, I recommend ctypes, since it does not require a compiler and will work with PyPy. For best performance use the native C-Python API, which I also found very easy to learn.
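To make the ctypes route concrete, here is a minimal sketch; the library name libmytool.so and the function process() are placeholders, so check your library's actual header for the real signature:

import ctypes

# load the shared library into this process - no subprocess is spawned
lib = ctypes.CDLL("./libmytool.so")  # hypothetical library name

# declare the signature of a hypothetical int process(const char *, size_t)
lib.process.argtypes = [ctypes.c_char_p, ctypes.c_size_t]
lib.process.restype = ctypes.c_int

data = b"some input"
status = lib.process(data, len(data))  # in-process call; status is directly usable
if status != 0:
    raise RuntimeError("process() failed with code %d" % status)

Note how the error handling point above plays out here: the return code arrives as a plain Python int, with no stderr parsing involved.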


What is the best or proper way to allow debugging of generated code?

For various reasons, in one project I generate executable code by generating an AST from various source files, then compiling that to bytecode (though the question could also apply to cases where the bytecode is generated directly, I guess).
From some experimentation, it looks like the debugger more or less just uses the lineno information embedded in the AST, alongside the filename passed to compile, in order to provide a representation for the debugger's purposes. However, this assumes the code being executed comes from a single on-disk file.
That is not necessarily the case for my project: the executable code can be pieced together from multiple sources, and some or all of these sources may have been fetched over the network or retrieved from non-disk storage (e.g. a database).
And so my Y questions, which may be the wrong ones (hence the background):
is it possible to provide a memory buffer of some sort, or is it necessary to generate a singular on-disk representation of the "virtual source"?
how well would the debugger deal with jumping around between the different bits and pieces if the virtual source can't or should not be linearised[0]?
and just in case, is the assumption of Python only supporting a single contiguous source file correct or can it actually be fed multiple sources somehow?
[0] for instance a web-style literate program would be debugged in its original form, jumping between the code sections, not in the so-called "tangled" form
Some of this can be handled by the trepan3k debugger. For other things various hooks are in place.
First of all, it can debug based on bytecode alone. But of course stepping by line won't be possible if the line number table doesn't exist. For that reason if no other, I would add a "line number" for each logical stopping point, such as at the beginning of statements. The numbers don't have to be line numbers; they could just count from 1, or be indexes into some other table. This is more or less how Go's Pos type works.
The debugger will let you set a breakpoint on a function, but that function has to exist, and when you start any Python program most of the functions you define don't exist yet. So the typical way to do this is to modify the source to call the debugger at some point. In trepan3k the lingo for this is:
from trepan.api import debug; debug()
Do that at a place where the functions you want to break on have already been defined.
And the functions can be specified as methods on existing variables, e.g. self.my_function()
One of the advanced features of this debugger is that it will decompile the bytecode to produce source code. There is a command called deparse which will show you the context around where you are currently stopped.
Deparsing bytecode, though, is a bit difficult, so depending on which kind of bytecode you have, the results may vary.
As for the virtual source problem, well that situation is somewhat tolerated in the debugger, since that kind of thing has to go on when there is no source. And to facilitate this and remote debugging (where the file locations locally and remotely can be different), we allow for filename remapping.
Another library, pyficache, is used for this remapping; I believe it has the ability to remap contiguous lines of one file into lines in another file, and I think you could apply this repeatedly. However, so far there hasn't been a need for this, and that code is pretty old, so someone would have to beef up trepan3k here.
Lastly, related to trepan3k is trepan-xpy, a CPython bytecode debugger which can step bytecode instructions even when the line number table is empty.
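On the "memory buffer" sub-question: one common CPython trick, independent of any particular debugger, is to register the generated source with linecache under a synthetic filename, so that tracebacks and line-oriented tools can display it without any on-disk file. A minimal sketch (the filename scheme here is just an illustration):

import linecache

source = "def add(a, b):\n    return a + b\n"
fname = "<generated:module1>"  # synthetic filename, never written to disk

# linecache entries are (size, mtime, lines, fullname); mtime=None marks
# the entry as non-reloadable. pdb, traceback etc. consult this cache.
linecache.cache[fname] = (len(source), None, source.splitlines(True), fname)

ns = {}
exec(compile(source, fname, "exec"), ns)
print(ns["add"](2, 3))  # tracebacks inside add() can now show real source lines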

Embed python into fortran 90

I was looking at the option of embedding Python into Fortran 90 to add Python functionality to my existing Fortran 90 code. I know that it can be done the other way around, extending Python with Fortran 90 using f2py from NumPy. But I want to keep my super-optimized main loop in Fortran and add Python to do some additional tasks / evaluate further developments before I can do them in Fortran, and also to ease code maintenance. I am looking for answers to the following questions:
1) Is there a library that already exists from which I can embed python into fortran? (I am aware of f2py and it does it the other way around)
2) How do we take care of data transfer from fortran to python and back?
3) How can we have callback functionality implemented? (Let me describe the scenario a bit... I have my main_fortran program in Fortran, which calls the Func1_Python module in Python. Now, from this Func1_Python, I want to call another function, say Func2_Fortran, in Fortran.)
4) What would be the impact of embedding the Python interpreter inside Fortran in terms of performance... like loading time, running time, sending data (a large array in double precision) across, etc.?
Thanks a lot in advance for your help!!
Edit1: I want to set the direction of the discussion right by adding some more information about the work I am doing. I am into scientific computing, so I work a lot with huge arrays/matrices in double precision, doing floating point operations; there are very few options other than Fortran really to do that work for me. The reason I want to include Python in my code is that I can use NumPy for some basic computations if necessary, and extend the capabilities of the code with minimal effort. For example, I can use one of the several libraries available to link Python with some other package (say OpenFOAM, using the PyFoam library).
1. Don't do it
I know that you're wanting to add Python code inside a Fortran program, instead of having a Python program with Fortran extensions. My first piece of advice is to not do this. Fortran is faster than Python at array arithmetic, but Python is easier to write than Fortran, it's easier to extend Python code with OOP techniques, and Python may have access to libraries that are important to you. You mention having a super-optimized main loop in Fortran; Fortran is great for super-optimized inner loops. The logic for passing a Fortran array around in a Python program with NumPy is much more straightforward than what you would have to do to correctly handle a Python object in Fortran.
When I start a scientific computing project from scratch, I always write first in Python, identify performance bottlenecks, and translate those into Fortran. Being able to test faster Fortran code against validated Python code makes it easier to show that the code is working correctly.
Since you have existing code, extending the Python code with a module made in Fortran will require refactoring, but this process should be straightforward. Separate the initialization code from the main loop, break the loop into logical pieces, wrap each of these routines in a Python function, and then your main Python code can call the Fortran subroutines and interleave these with Python functions as appropriate. In this process, you may be able to preserve a lot of the optimizations you have in your main loop in Fortran. F2PY is a reasonably standard tool for this, so it won't be tough to find people who can help you with whatever problems will arise.
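A minimal sketch of that workflow, assuming a Fortran subroutine step(u, dt) in step.f90 declares u as intent(inout) and has been wrapped with f2py (module and routine names are hypothetical; built with: python -m numpy.f2py -c -m flib step.f90):

import numpy as np
import flib  # hypothetical f2py-built extension module

u = np.zeros(1000, dtype=np.float64)
u[0] = 1.0
for n in range(300):       # the outer (time) loop now lives in Python
    flib.step(u, 0.01)     # the optimized inner loop stays in Fortran
    if n % 100 == 0:       # interleave Python-side diagnostics at will
        print(n, u.sum())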
2. System calls
If you absolutely must have Fortran code calling Python code, instead of the other way around, the simplest way to do this is to just have the Fortran code write some data to disk, and run the Python code with a SYSTEM or EXECUTE_COMMAND_LINE. If you use EXECUTE_COMMAND_LINE, you can have the Python code output its result to stdout, and the Fortran code can read it as character data; if you have a lot of output (e.g., a big matrix), it would make more sense for the Python code to output a file that the Fortran code then reads. Disk read/write overhead could wind up being prohibitively significant for this. Also, you would have to write Fortran code to output your data, Python code to read it, Python code to output it again, and Fortran code to re-input the data. This code should be straightforward to write and test, but keeping these four parts in sync as you edit the code may turn into a headache.
(This approach is tried in this Stack Overflow question)
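A sketch of what the Python side of this file-based approach might look like (file names and the actual computation are placeholders); the Fortran code would run it with something like CALL EXECUTE_COMMAND_LINE("python helper.py in.dat out.dat"):

# helper.py - invoked from Fortran, communicates purely via files
import sys

import numpy as np

data = np.loadtxt(sys.argv[1])   # matrix previously written by the Fortran code
result = data @ data.T           # stand-in for whatever Python-side work is needed
np.savetxt(sys.argv[2], result)  # Fortran reads this file back in afterwards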
3. Embedding Python in C in Fortran
There is no way that I know of to directly pass a Python object in memory to Fortran. However, Fortran code can call C code, and C code can have Python embedded in it. (See the Python tutorial on extending and embedding.) In general, extending Python (like I recommend in point 1) is preferable to embedding it in C/C++. (See Extending Vs. Embedding: There is Only One Correct Decision.) Getting this to work will be a nightmare, because any communication problems between Python and Fortran could happen between Python and C, or between C and Fortran. I don't know if anyone is actually embedding Python in C in Fortran, and so getting help will be difficult.
I have developed the library Forpy that allows you to use Python in Fortran (embedding).
It uses Fortran C interoperability to call Python C API functions.
While I agree that extending (using Fortran in Python) is often preferable, embedding has its uses:
- Large, existing Fortran codes might need a substantial amount of refactoring before they can be used from Python; here embedding can save development time.
- Replacing a part of an existing code with a Python implementation.
- Temporarily embedding Python to experiment with a given Fortran code, for example to test alternative algorithms or to extract intermediate results.
Besides embedding, Forpy also supports extending Python.
With Forpy you can write a Python extension module entirely in Fortran.
An advantage over existing tools such as f2py is that you can use Python datatypes (e.g. to write a function that takes a Python list as an argument, or a function that returns a Python dict).
Working with existing, possibly legacy, Fortran codes is often very challenging, and I think that developers should have tools at their disposal both for embedding and extending Python.
If you are going to embed Python in Fortran, you will have to do it via Fortran's C interface; that's what ISO_C_BINDING is for. I would caution against embedding Python, not because of the technical difficulty in doing so, but because Python (the language or the community) seems adamantly opposed to Python being used as a subordinate language. The common view is that whatever non-Python language your code is currently written in should be broken up into libraries and used to extend Python, never the other way around. So you will see (as here) more responses trying to convince you that you really don't want to do what you actually want to do than actual technical assistance.
This is not flaming or editorializing or making a moral judgment; this is a simple statement of fact. You will not get help from the Python community if you try to embed Python.
If what you need is functionality beyond what the Fortran language itself supports (e.g. filesystem operations) and you do not specifically need Python and you want a language more expressive than C, you may want to look at embedding Lua instead. Unlike Python, Lua is specifically meant to be embedded so you are likely to face much less social and technical resistance.
There are a number of projects which integrate Fortran and Lua, the most complete one I've seen to date is Aotus. The author is very responsive and the integration process is simple.
Admittedly, this does not answer the original question (how to embed a Python interpreter in a Fortran 90 application), but to be fair, none of the other responses do either. I use Python as my portable general-purpose language of choice these days and I'd really prefer to stick with it when extending our primary products (written in Fortran). For the reasons laid out above, I abandoned my attempts to embed Python and switched to embedding Lua; for the social reasons described above, I feel Lua is the much better choice here. It's not my first choice, but it's workable, at least in my case.
Apologies if I've offended anyone; I'm not trying to pick a fight, just relating my experience when researching this specific topic.
There is a very easy way to do this using f2py. Write your Python method and add it as an input to your Fortran subroutine. Declare it in both the cf2py hook and the type declaration as EXTERNAL, and also with its return value type, e.g. REAL*8. Your Fortran code will then have a pointer to the address where the Python method is stored. It will be SLOW AS MOLASSES, but for testing out algorithms it can be useful. I do this often (I port a lot of ancient spaghetti Fortran to Python modules...). It's also a great way to use things like optimised SciPy calls in legacy Fortran.
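A sketch of the calling side under this scheme, assuming a Fortran subroutine driver with an EXTERNAL argument has been wrapped with f2py (flib, driver and the argument layout are all hypothetical):

import numpy as np
import flib  # hypothetical f2py-built module containing subroutine driver

def py_force(t):
    # Python callback that f2py passes through to the EXTERNAL argument;
    # each call crosses the Fortran/Python boundary, hence the slowness.
    return np.sin(t)

flib.driver(py_force, 300)  # Fortran calls py_force(t) on each of 300 steps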
I have tried out several approaches to solve the problem and I have found one possibly optimal way of doing it. I will briefly list down the approaches and the results.
1) Embedding via system call: Every time we want to access Python from Fortran, we use a system call to execute some Python script and exchange data between the two through files. The speed of this approach is limited by disk reads/writes (in this age of cache-level optimization of code, going to disk is a mortal sin). Also, we need to initialize the interpreter every time we want to execute the script, which is a considerable overhead. A simple Runge-Kutta 4th order method running for 300 timesteps took a whopping 59 seconds to execute.
2) Going from Fortran to Python via C: We use the ISO_C_BINDING module to communicate between Fortran and C, and we embed the Python interpreter inside the C layer. I got parts of it working, but in the meanwhile I found a better way and dropped this idea. I would still like to evaluate it for the sake of completeness, though.
3) Using f2py to import Fortran subroutines into Python (extending):
Here, we take the main loop out of Fortran and code it in Python (this approach is called extending Python with Fortran), and we import all Fortran subroutines into Python using f2py (http://cens.ioc.ee/projects/f2py2e/usersguide/). We have the flexibility of keeping the most important part of any scientific application, i.e. the outermost loop (generally the time loop), in Python, so that we can couple it with other applications. But we also have the drawback of exchanging possibly more data than needed between Fortran and Python. The same Runge-Kutta 4th order example took 0.372 seconds to execute.
4) Mimicking Embedding via Extending:
Till now we have seen the two pure approaches: embedding (the main loop stays in Fortran and we call Python as needed) and extending (the main loop stays in Python and we call Fortran as needed). There is another way, which I found to be the most optimal. Transferring parts of the main loop into Python leads to an overhead which may not be necessary all the time. To get rid of this overhead, we can keep the main loop in Fortran, converted into a subroutine without any changes, and have a pseudo main loop in Python which just calls the main loop in Fortran; the program then executes as if it were our untouched Fortran program. Whenever necessary, we can use a callback function to come back to Python with the required data, execute a script, and go back to Fortran again. In this approach, the Runge-Kutta 4th order method took 0.083 seconds. I profiled the code and found that the initialization of the Python interpreter and loading took 0.075 seconds, while the program itself took only 0.008 seconds (which includes 300 callbacks to Python). The original Fortran code took 0.007 seconds. So we get almost Fortran-like performance with Python-like flexibility using this approach.
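In code, the pseudo main loop of approach 4 might look like this (again with hypothetical f2py-built names; the real main loop, now a subroutine, lives entirely in Fortran):

import flib  # hypothetical f2py-built module wrapping the old main program

def inspect_state(step, u):
    # callback: Fortran hands control (and data) back to Python only
    # at the points where we explicitly asked for it
    print(step, u.mean())

flib.main_loop(inspect_state)  # one call; Fortran drives the whole run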
I've just successfully embedded Python into our in-house ~500 KLOC Fortran program with cffi. An important aspect was not to touch the existing code. The program is written in Fortran 95. I wrote a thin 2003 wrapper using the iso_c_binding module that simply imports data from the various modules, gets C pointers to those data and/or wraps Fortran types into C structs, puts everything into a single type/struct and sends off to a C function. This C function happens to be a Python function wrapped with cffi. It unpacks the C struct into a more user-friendly Python object, wraps Fortran arrays as Numpy arrays (no copying) and then either drops into an interactive Python console or runs a Python script, based on user configuration. There is no need to write C code except for a single header file. There is obviously quite some overhead, but this functionality is intended for extendibility, not performance.
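A rough sketch of the cffi embedding machinery this relies on, using cffi's documented embedding mode (the entry point name, C signature and array handling below are illustrative placeholders, not the actual in-house code):

# plugin_build.py - builds a C-callable entry point via cffi's embedding mode
import cffi

ffibuilder = cffi.FFI()
ffibuilder.embedding_api("int handle_state(double *u, int n);")
ffibuilder.set_source("_plugin", "")
ffibuilder.embedding_init_code("""
    from _plugin import ffi
    import numpy as np

    @ffi.def_extern()
    def handle_state(u, n):
        # wrap the Fortran array in a NumPy view - no copying involved
        arr = np.frombuffer(ffi.buffer(u, n * 8), dtype=np.float64)
        print(arr.mean())
        return 0
""")
ffibuilder.compile(target="libplugin.*", verbose=True)

The Fortran wrapper then only needs a C interface declaration for handle_state and can call it like any other C function.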
I would advise against f2py. It's not maintained well and severely limits the public interface of your Fortran code.
I wrote a library for that, forcallpy, using a C layer that embeds the Python interpreter. It focuses specifically on argument passing between Fortran and Python to make calling scripts as easy as possible (it uses embedded NumPy to map Fortran arrays directly into ndarrays, and uses argument names to determine their types on the C/Python side).
You can see some examples in the documentation at readthedocs.
Laurent.

Using Python 3 with Python 2

I have a Python 3 file. I want to use an open-source tool on the internet (nltk), but unfortunately it only supports Python 2. There is no way for me to convert it to Python 3, nor can I convert my Python 3 file to Python 2.
If the user does not give a certain argument (on argparse) then I do something in my file. If the user does give a certain argument, however, I need to use nltk.
Writing a Python 2 script that uses nltk, and then executing that script from my Python 3 script
My current idea is to write a script in Python 2 that does what I want with nltk and then run that from my current Python 3 script. However, I don't actually know how to do this.
I found this code: os.system(command) and so I will modify it to be os.system("python py2.py") (where py2.py is my newly written Python 2 file).
I'm not sure if that will work.
I also don't know if that is the most efficient way to solve my problem. I cannot find any information about it on the internet.
The data transferred will probably be quite large. Currently, my test data is about 6600 lines, utf-8. Functionality is more important than how long it takes (to a certain extent) in my case.
Also, how would I pass values from my Python 2 script to my Python 3 script?
Thanks
Is there any other way to do this?
Well, if you're sure you can't convert your script to Python 2, then having one script call the other by running the Python interpreter probably is the best way. (And, this being Python, the best way is, or at least should be, the only way.)
But are you sure? Between the six module, the 3to2 tool, and __future__ statements, it may not be as hard as you think.
Anyway, if you do need to have one script call the other, you should almost never use os.system. As the docs for that function say:
The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function. See the Replacing Older Functions with the subprocess Module section in the subprocess documentation for some helpful recipes.
The simplest version is this:
import subprocess

subprocess.check_call(["python", "py2.py"])
This runs your script, waits for it to finish, and raises an exception if the script returns failure—basically, what you wanted to do with os.system, but better. (For example, it doesn't spawn an unnecessary extra shell, it takes care of error handling, etc.)
That assumes whatever other data you need to share is being shared in some implicit, external way (e.g., by accessing files with the same name). You might be better off passing data to py2.py as command-line arguments and/or stdin, passing data back as via stdout, or even opening an explicit pipe or socket to pass things over. Without knowing more about exactly what you need to do, it's hard to suggest anything, but the docs, especially the section Replacing Older Functions with the subprocess Module have lots of discussion on the options.
To give you an idea, here's a simple example: to pass one of your filename arguments to py2.py, and then get data back from py2.py to py3.py, just have py3.py do this:
py2output = subprocess.check_output(["python", "py2.py", my_args[0]])
And then in py2.py, just print whatever you want to send back.
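For example, the Python 3 side might look like this (note that check_output returns bytes on Python 3, so decode before parsing; the interpreter name and file name are just for illustration):

import subprocess

py2output = subprocess.check_output(["python2", "py2.py", "input.txt"])
lines = py2output.decode("utf-8").splitlines()  # ~6600 lines is no problem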
The question 'Anyone hear when NLTK 3.0 will be out?' here on SO points out that...
There's a Python 3 branch:
https://github.com/nltk/nltk/tree/nltk-py3k
The answer is from July 2011; things may have improved since then.
I have just looked at https://github.com/nltk/nltk. There is at least a document that talks about Python 3 porting: https://github.com/nltk/nltk/blob/2and3/web/dev/python3porting.rst.
Here is a longer discussion on NLTK and Python 3 that you may be interested in.
And the article 'Grants to Assist Kivy, NLTK in Porting to Python 3' (published 3 days ago) is directly related to the problem.

Profiling Python via C-api (How to ? )

I have used the Python's C-API to call some Python code in my c code and now I want to profile my python code for bottlenecks. I came across the PyEval_SetProfile API and am not sure how to use it. Do I need to write my own profiling function?
I will be very thankful if you can provide an example or point me to an example.
If you only need to know the amount of time spent in the Python code, and not (for example) where in the Python code the most time is spent, then the Python profiling tools are not what you want. I would write some simple C code that samples the time before and after the Python interpreter invocation, and use that. Or use C-level profiling tools to measure the Python interpreter invocation as a C function call.
If you need to profile within the Python code, I wouldn't recommend writing your own profile function. All it does is provide you with raw data; you'd still have to aggregate and analyze it. Instead, write a Python wrapper around your Python code that invokes the cProfile module to capture data that you can then examine.
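A minimal sketch of such a wrapper (mymodule and entry_point are placeholders for your own code); the C side would then invoke profiled_entry instead of the original function:

import cProfile
import pstats

import mymodule  # hypothetical module containing the code called from C

def profiled_entry(*args, **kwargs):
    prof = cProfile.Profile()
    result = prof.runcall(mymodule.entry_point, *args, **kwargs)
    prof.dump_stats("embedded.prof")  # aggregated data written to disk
    return result

# later, in an ordinary Python session:
# pstats.Stats("embedded.prof").sort_stats("cumulative").print_stats(10)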

How can I sandbox Python in pure Python?

I'm developing a web game in pure Python, and want some simple scripting available to allow for more dynamic game content. Game content can be added live by privileged users.
It would be nice if the scripting language could be Python. However, it can't run with access to the environment the game runs on since a malicious user could wreak havoc which would be bad. Is it possible to run sandboxed Python in pure Python?
Update: In fact, since true Python support would be way overkill, a simple scripting language with Pythonic syntax would be perfect.
If there aren't any Pythonic script interpreters, are there any other open source script interpreters written in pure Python that I could use? The requirements are support for variables, basic conditionals and function calls (not definitions).
This is really non-trivial.
There are two ways to sandbox Python. One is to create a restricted environment (i.e., very few globals etc.) and exec your code inside this environment. This is what Messa is suggesting. It's nice, but there are lots of ways to break out of the sandbox and create trouble. There was a thread about this on Python-dev a year ago or so in which people did everything from catching exceptions and poking at internal state to bytecode manipulation in order to break out. This is the way to go if you want a complete language.
The other way is to parse the code and then use the ast module to kick out constructs you don't want (e.g. import statements, function calls etc.) and then to compile the rest. This is the way to go if you want to use Python as a config language etc.
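A minimal sketch of that second approach - walking the AST and rejecting any node type outside a whitelist (the allowed set below is only an illustration; tune it to your needs):

import ast

ALLOWED = (ast.Module, ast.Expr, ast.Assign, ast.Name, ast.Load, ast.Store,
           ast.Constant, ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
           ast.Compare, ast.Gt, ast.Lt, ast.If, ast.Call)  # no Import, no FunctionDef...

def compile_checked(source):
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError("forbidden construct: %s" % type(node).__name__)
    return compile(tree, "<sandbox>", "exec")

ns = {"__builtins__": {}}
exec(compile_checked("x = 1 + 2\nif x > 2: y = x * 10"), ns)
print(ns["y"])  # 30
# compile_checked("import os") raises ValueError: forbidden construct: Import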
Another way (which might not work for you since you're using GAE), is the PyPy sandbox. While I haven't used it myself, word on the intertubes is that it's the only real sandboxed Python out there.
Based on your description of the requirements (support for variables, basic conditionals and function calls, but not definitions), you might want to evaluate approach 2 and kick out everything else from the code. It's a little tricky but doable.
Roughly ten years after the original question, Python 3.8.0 comes with auditing. Can it help? Let's limit the discussion to hard-drive writing for simplicity - and see:
from sys import addaudithook

def block_mischief(event, arg):
    if 'WRITE_LOCK' in globals() and ((event == 'open' and arg[1] != 'r')
            or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
        raise IOError('file write forbidden')

addaudithook(block_mischief)
So far exec could easily write to disk:
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
But we can forbid it at will, so that no wicked user can access the disk from the code supplied to exec(). Pythonic modules like numpy or pickle eventually use Python's file access, so they are banned from disk writes, too. External program calls have been explicitly disabled as well.
WRITE_LOCK = True
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("open('/tmp/FILE','a').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("numpy.savetxt('/tmp/FILE', numpy.eye(3))", dict(locals()))
exec("import subprocess; subprocess.call('echo PWNED >> /tmp/FILE', shell=True)", dict(locals()))
An attempt to remove the lock from within exec() seems to be futile, since the auditing hook uses a different copy of locals that is not accessible to the code run by exec. Please prove me wrong.
exec("print('muhehehe'); del WRITE_LOCK; open('/tmp/FILE','w')", dict(locals()))
...
OSError: file write forbidden
Of course, the top-level code can enable file I/O again.
del WRITE_LOCK
exec("open('/tmp/FILE','w')", dict(locals()))
Sandboxing within CPython has proven extremely hard and many previous attempts have failed. This approach is also not entirely secure, e.g. for public web access:
- perhaps hypothetical compiled modules that use direct OS calls cannot be audited by CPython - whitelisting the safe pure-Python modules is recommended;
- there is definitely still the possibility of crashing or overloading the CPython interpreter;
- maybe there remain even some loopholes to write files on the hard drive, too.
But I could not use any of the usual sandbox-evasion tricks to write a single byte. We can say the "attack surface" of the Python ecosystem reduces to a rather narrow list of events to be (dis)allowed: https://docs.python.org/3/library/audit_events.html
I would be thankful to anybody pointing me to the flaws of this approach.
EDIT: So this is not safe either! I am very thankful to @Emu for his clever hack using exception catching and introspection:
#!/usr/bin/python3.8
from sys import addaudithook

def block_mischief(event, arg):
    if 'WRITE_LOCK' in globals() and ((event == 'open' and arg[1] != 'r')
            or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
        raise IOError('file write forbidden')

addaudithook(block_mischief)
WRITE_LOCK = True

exec("""
import sys

def r(a, b):
    try:
        raise Exception()
    except:
        del sys.exc_info()[2].tb_frame.f_back.f_globals['WRITE_LOCK']

w = type('evil', (object,), {'__ne__': r})()
sys.audit('open', None, w)
open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')""", dict(locals()))
I guess that auditing+subprocessing is the way to go, but do not use it on production machines:
https://bitbucket.org/fdominec/experimental_sandbox_in_cpython38/src/master/sandbox_experiment.py
AFAIK it is possible to run a code in a completely isolated environment:
exec somePythonCode in {'__builtins__': {}}, {}
But in such an environment you can do almost nothing :) (you cannot even import a module; and a malicious user can still run an infinite recursion or cause the process to run out of memory). Probably you would want to add some modules to serve as the interface to your game engine.
I'm not sure why nobody mentions this, but Zope 2 has a thing called Python Script, which is exactly that - restricted Python executed in a sandbox, without any access to filesystem, with access to other Zope objects controlled by Zope security machinery, with imports limited to a safe subset.
Zope in general is pretty safe, so I would imagine there are no known or obvious ways to break out of the sandbox.
I'm not sure how exactly Python Scripts are implemented, but the feature was around since like year 2000.
And here's the magic behind PythonScripts, with detailed documentation: http://pypi.python.org/pypi/RestrictedPython - it even looks like it doesn't have any dependencies on Zope, so can be used standalone.
Note that this is not for safely running arbitrary python code (most of the random scripts will fail on first import or file access), but rather for using Python for limited scripting within a Python application.
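A minimal usage sketch, based on RestrictedPython's documented compile_restricted / safe_builtins entry points (check the project docs for the guarded getattr/setattr helpers before relying on this):

from RestrictedPython import compile_restricted, safe_builtins

source = "result = 6 * 7"
bytecode = compile_restricted(source, "<script>", "exec")

ns = {"__builtins__": safe_builtins}
exec(bytecode, ns)
print(ns["result"])  # 42
# "import os" would fail at exec time: safe_builtins provides no __import__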
This answer is from my comment to a question closed as a duplicate of this one: Python from Python: restricting functionality?
I would look into a two-server approach. The first server is the privileged web server where your code lives. The second server is a very tightly controlled server that only provides a web service or RPC service and runs the untrusted code. You provide your content creator with your custom interface. For example, if you allowed the end user to create items, you would have a lookup that called the server with the code to execute and the set of parameters.
Here's an abstract example for a healing potion.
{function_id='healing potion', action='use', target='self', inventory_id='1234'}
The response might be something like
{hp='+5' action={destroy_inventory_item, inventory_id='1234'}}
Hmm. This is a thought experiment, I don't know of it being done:
You could use the compiler package to parse the script. You can then walk this tree, prefixing all identifiers - variables, method names etc. (also hasattr/getattr/setattr invocations and so on) - with a unique preamble so that they cannot possibly refer to your variables. You could also ensure that the compiler package itself was not invoked, and perhaps blacklist other things such as opening files. You then emit the Python code for this, and compiler.compile it.
The docs note that the compiler package is not in Python 3.0, but do not mention what the 3.0 alternative is (the ast module is its closest replacement).
In general, this is parallel to how forum software and the like try to whitelist 'safe' JavaScript or HTML etc. And they historically have a bad record of stomping all the escapes. But you might have more luck with Python :)
I think your best bet is going to be a combination of the replies thus far.
You'll want to parse and sanitise the input - removing any import statements for example.
You can then use Messa's exec sample (or something similar) to allow the code execution against only the builtin variables of your choosing - most likely some sort of API defined by yourself that provides the programmer access to the functionality you deem relevant.
