I want to know whether there is a PyMOD(DEINIT)_FUNC?, I know that resources are released when the python script exits, but I would like to make my code as responsible for the memory it uses as possible.
I have searched the documentaion of course, and until now I think there is no function called from python core to the c module when the module is unloaded, but I hope there is and I just don't know how to search.
Python 2 does not support module finalisation, no. See bug 9072:
Please accept that Python indeed does not support unloading modules for severe, fundamental, insurmountable, technical problems, in 2.x.
For Python 3, the C API for module initialisation was overhauled (see PEP 3121) and the new PyModuleDef struct has a m_free slot that takes a callback function; use that to clear out your module memory.
This question already has answers here:
create a global function in python
(1 answer)
How to add builtin functions?
(4 answers)
Closed 9 years ago.
I am looking to make a global function in python 3.3 like the print function. In particular we have embedded python in our own application and we want to expose a simple 'debug(value)' global function, available to any script. It is possible for us to do this by attaching the function to a module, however, for convenience it would be easier for it to be global like 'print(value)'.
How do you declare a global function that becomes available to any python file without imports, or is this a black box in python? Is it possible to do from the C side binding?
This is almost always a bad idea, but if you really want to do it…
If you print out or otherwise inspect the print function, you'll see it's in the module builtins. That's your clue. So, you can do this:
debugmodule.py:
import builtins
builtins.debug = debug
Now, after an import debugmodule, any other module can just call debug instead of debugmodule.debug.
Is it possible to do from the C side binding?
In CPython, C extension module can basically do the same thing that a pure Python module does. Or, even more simply, write a _debugmodule.so in C, then a debugmodule.py that imports it and copies debug into builtins.
If you're embedding CPython, you can do this just by injecting the function into the builtins module before starting the script/interactive shell/whatever, or at any later time.
While this definitely works, it's not entirely clear whether this is actually guaranteed to work. If you read the docs, it says:
As an implementation detail, most modules have the name __builtins__ made available as part of their globals. The value of __builtins__ is normally either this module or the value of this module’s __dict__ attribute. Since this is an implementation detail, it may not be used by alternate implementations of Python.
And, at least in CPython, it's actually that __builtins__ module or dict that gets searched as part of the lookup chain, not the builtins module. So, it might be possible that another implementation could look things up in __builtins__ like CPython does, but at the same time not make builtins (or user modifications to it) automatically available in __builtins__, in which case this wouldn't work. (Since CPython is the only 3.x implementation available so far, it's hard to speculate…)
If this doesn't work on some future Python 3.x implementation, the only option I can think of is to get your function injected into each module, instead of into builtins. You could do that with a PEP-302 import hook (which is a lot easier in 3.3 than it was when PEP 302 was written… read The import system for details).
In 2.x, instead of a module builtins that automatically injects things into a magic module __builtins__, there's just a magic module __builtin__ (notice the missing s). You may or may not have to import it (so you might as well, to be safe). And you may or may not be able to change it. But it works in (at least) CPython and PyPy.
So, what's the right way to do it? Simple: instead of import debugmodule, just from debugmodule import debug in all of your other modules. That way it ends up being a module-level global in every module that needs it.
How do I, at run-time (no LD_PRELOAD), intercept/hook a C function like fopen() on Linux, a la Detours for Windows? I'd like to do this from Python (hence, I'm assuming that the program is already running a CPython VM) and also reroute to Python code. I'm fine with just hooking shared library functions. I'd also like to do this without having to change the way the program is run.
One idea is to roll my own tool based on ptrace(), or on rewriting code found with dlsym() or in the PLT, and targeting ctypes-generated C-callable functions, but I thought I'd ask here first. Thanks.
You'll find from one of ltrace developer a way to do this. See this post, which includes a full patch in order to catch dynamically loaded library. In order to call it from python, you'll probably need to make a C module.
google-perftools has their own implementation of Detour under src/windows/preamble_patcher* . This is windows-only at the moment, but I don't see any reason it wouldn't work on any x86 machine except for the fact that it uses win32 functions to look up symbol addresses.
A quick scan of the code and I see these win32 functions used, all of which have linux versions:
GetModuleHandle/GetProcAddress : get the function address. dlsym can do this.
VirtualProtect : to allow modification of the assembly. mprotect.
GetCurrentProcess: getpid
FlushInstructionCache (apparently a nop according to the comments)
It doesn't seem too hard to get this compiled and linked into python, but I'd send a message to the perftools devs and see what they think.
I was wondering whether objects serialized using CPython's cPickle are readable by using IronPython's cPickle; the objects in question do not require any modules outside of the built-ins that both Cpython and IronPython include. Thank you!
If you use the default protocol (0) which is text based, then things should work. I'm not sure what will happen if you use a higher protocol. It's very easy to test this ...
It will work because when you unpickle objects during load() it will use the current definitions of whatever classes you have defined now, not back when the objects were pickled.
IronPython is simply Python with the standard library implemented in C# so that everything emits IL. Both the CPython and the IronPython pickle modules have the same functionality, except one is implemented in C and the other in C#.
I'm developing a web game in pure Python, and want some simple scripting available to allow for more dynamic game content. Game content can be added live by privileged users.
It would be nice if the scripting language could be Python. However, it can't run with access to the environment the game runs on since a malicious user could wreak havoc which would be bad. Is it possible to run sandboxed Python in pure Python?
Update: In fact, since true Python support would be way overkill, a simple scripting language with Pythonic syntax would be perfect.
If there aren't any Pythonic script interpreters, are there any other open source script interpreters written in pure Python that I could use? The requirements are support for variables, basic conditionals and function calls (not definitions).
This is really non-trivial.
There are two ways to sandbox Python. One is to create a restricted environment (i.e., very few globals etc.) and exec your code inside this environment. This is what Messa is suggesting. It's nice but there are lots of ways to break out of the sandbox and create trouble. There was a thread about this on Python-dev a year ago or so in which people did things from catching exceptions and poking at internal state to break out to byte code manipulation. This is the way to go if you want a complete language.
The other way is to parse the code and then use the ast module to kick out constructs you don't want (e.g. import statements, function calls etc.) and then to compile the rest. This is the way to go if you want to use Python as a config language etc.
Another way (which might not work for you since you're using GAE), is the PyPy sandbox. While I haven't used it myself, word on the intertubes is that it's the only real sandboxed Python out there.
Based on your description of the requirements (The requirements are support for variables, basic conditionals and function calls (not definitions)) , you might want to evaluate approach 2 and kick out everything else from the code. It's a little tricky but doable.
Roughly ten years after the original question, Python 3.8.0 comes with auditing. Can it help? Let's limit the discussion to hard-drive writing for simplicity - and see:
from sys import addaudithook
def block_mischief(event,arg):
if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r')
or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']): raise IOError('file write forbidden')
addaudithook(block_mischief)
So far exec could easily write to disk:
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
But we can forbid it at will, so that no wicked user can access the disk from the code supplied to exec(). Pythonic modules like numpy or pickle eventually use the Python's file access, so they are banned from disk write, too. External program calls have been explicitly disabled, too.
WRITE_LOCK = True
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("open('/tmp/FILE','a').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("numpy.savetxt('/tmp/FILE', numpy.eye(3))", dict(locals()))
exec("import subprocess; subprocess.call('echo PWNED >> /tmp/FILE', shell=True)", dict(locals()))
An attempt of removing the lock from within exec() seems to be futile, since the auditing hook uses a different copy of locals that is not accessible for the code ran by exec. Please prove me wrong.
exec("print('muhehehe'); del WRITE_LOCK; open('/tmp/FILE','w')", dict(locals()))
...
OSError: file write forbidden
Of course, the top-level code can enable file I/O again.
del WRITE_LOCK
exec("open('/tmp/FILE','w')", dict(locals()))
Sandboxing within Cpython has proven extremely hard and many previous attempts have failed. This approach is also not entirely secure e.g. for public web access:
perhaps hypothetical compiled modules that use direct OS calls cannot be audited by Cpython - whitelisting the safe pure pythonic modules is recommended.
Definitely there is still the possibility of crashing or overloading the Cpython interpreter.
Maybe there remain even some loopholes to write the files on the harddrive, too. But I could not use any of the usual sandbox-evasion tricks to write a single byte. We can say the "attack surface" of Python ecosystem reduces to rather a narrow list of events to be (dis)allowed: https://docs.python.org/3/library/audit_events.html
I would be thankful to anybody pointing me to the flaws of this approach.
EDIT: So this is not safe either! I am very thankful to #Emu for his clever hack using exception catching and introspection:
#!/usr/bin/python3.8
from sys import addaudithook
def block_mischief(event,arg):
if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r') or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
raise IOError('file write forbidden')
addaudithook(block_mischief)
WRITE_LOCK = True
exec("""
import sys
def r(a, b):
try:
raise Exception()
except:
del sys.exc_info()[2].tb_frame.f_back.f_globals['WRITE_LOCK']
import sys
w = type('evil',(object,),{'__ne__':r})()
sys.audit('open', None, w)
open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')""", dict(locals()))
I guess that auditing+subprocessing is the way to go, but do not use it on production machines:
https://bitbucket.org/fdominec/experimental_sandbox_in_cpython38/src/master/sandbox_experiment.py
AFAIK it is possible to run a code in a completely isolated environment:
exec somePythonCode in {'__builtins__': {}}, {}
But in such environment you can do almost nothing :) (you can not even import a module; but still a malicious user can run an infinite recursion or cause running out of memory.) Probably you would want to add some modules that will be the interface to you game engine.
I'm not sure why nobody mentions this, but Zope 2 has a thing called Python Script, which is exactly that - restricted Python executed in a sandbox, without any access to filesystem, with access to other Zope objects controlled by Zope security machinery, with imports limited to a safe subset.
Zope in general is pretty safe, so I would imagine there are no known or obvious ways to break out of the sandbox.
I'm not sure how exactly Python Scripts are implemented, but the feature was around since like year 2000.
And here's the magic behind PythonScripts, with detailed documentation: http://pypi.python.org/pypi/RestrictedPython - it even looks like it doesn't have any dependencies on Zope, so can be used standalone.
Note that this is not for safely running arbitrary python code (most of the random scripts will fail on first import or file access), but rather for using Python for limited scripting within a Python application.
This answer is from my comment to a question closed as a duplicate of this one: Python from Python: restricting functionality?
I would look into a two server approach. The first server is the privileged web server where your code lives. The second server is a very tightly controlled server that only provides a web service or RPC service and runs the untrusted code. You provide your content creator with your custom interface. For example you if you allowed the end user to create items, you would have a look up that called the server with the code to execute and the set of parameters.
Here's and abstract example for a healing potion.
{function_id='healing potion', action='use', target='self', inventory_id='1234'}
The response might be something like
{hp='+5' action={destroy_inventory_item, inventory_id='1234'}}
Hmm. This is a thought experiment, I don't know of it being done:
You could use the compiler package to parse the script. You can then walk this tree, prefixing all identifiers - variables, method names e.t.c. (also has|get|setattr invocations and so on) - with a unique preamble so that they cannot possibly refer to your variables. You could also ensure that the compiler package itself was not invoked, and perhaps other blacklisted things such as opening files. You then emit the python code for this, and compiler.compile it.
The docs note that the compiler package is not in Python 3.0, but does not mention what the 3.0 alternative is.
In general, this is parallel to how forum software and such try to whitelist 'safe' Javascript or HTML e.t.c. And they historically have a bad record of stomping all the escapes. But you might have more luck with Python :)
I think your best bet is going to be a combination of the replies thus far.
You'll want to parse and sanitise the input - removing any import statements for example.
You can then use Messa's exec sample (or something similar) to allow the code execution against only the builtin variables of your choosing - most likely some sort of API defined by yourself that provides the programmer access to the functionality you deem relevant.