This subject has been disturbing me for some time.
For my Python project I wanted to be able to support Python versions 2.4 to 3.1. I thought a bit about how to do this, and eventually decided to have four separate forks of the source code for four different versions of Python: 2.4, 2.5, 2.6 and 3.1.
I have come to view that as a bad decision, mainly because of Python's distribution annoyances, which I now have to do four times instead of one.
The question is, what to do?
My project is in the scientific computing field. I got the impression that there are still many people who depend on Python 2.4.
Someone suggested I just write my entire project for 2.4, but that is unacceptable for me. That will mean I could not use context managers, and that is something I will not give up on.
How do ordinary Python projects support 2.4? Do they avoid using context managers?
Also, is there any choice but having a separate fork for Python 3.1? I know there are all kinds of hacks for making the same code run on 2.x and 3.x, but one of the reasons I like Python is because the code is beautiful, and I will not tolerate making it ugly with compatibility hacks.
Please, give me your opinion.
Yes, you need to write for Python 2.4 syntax to support all of 2.4 - 2.7 in the same codebase.
Some changes in Python 2.6 and 2.7 aim to make it a bit easier to write compatible code with 3.x, but you have to drop support for 2.5 and below to do that.
There seem to be different answers to your problem.
First, if you want to offer all functions for all Python versions then yes, you're probably stuck with using the smallest possible functionality subset - hence writing your code for Python 2.4. Or you could backport features from newer interpreters if they're pure Python (that's not the case for context managers or coroutines, either).
Or you could split version support into features - if you think there's one (optional) feature which would have great benefit from, let's say, context managers, you can make it available in a separate module and just say that 2.4 users don't have that feature.
To support Python 3, take a look at the 2to3 helper. If you write your code properly, there's a fair chance you won't need to maintain two separate codebases.
If the differences between versions are not extreme, you can try isolating them into a separate package or module in which you write version-specific code to act as an adaptation layer.
In a trivial fashion, this can be done without the separate module in simple cases, such as when a new version of Python makes standard a package that used to be external, such as (for example) simplejson. We have something similar to this in some code:
try:
    import simplejson as json
except ImportError:
    import json
For non-trivial stuff, such as what you probably have, you wouldn't want such things scattered randomly throughout your code base, so you should collect it all together in one place, when possible, and make that the sole section of your code that is version-specific.
This can't work so well for things where the syntax is different, such as your comment about wanting to use context managers. Sure, you could put the context manager code in a separate module, but that will likely complicate the places where you'd be using it. In such cases, you might backport certain critical features (I think context managers could be simulated somewhat easily) to this adapter module.
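One way to picture the backport mentioned above is a helper that takes the body of the block as a function. This is only a minimal sketch: the names with_context and Resource are made up for illustration, and it uses nothing but try/except, so it does not need the with statement itself.

```python
import sys


def with_context(manager, body):
    """Roughly emulate `with manager as value: body(value)` without the with statement."""
    value = manager.__enter__()
    try:
        result = body(value)
    except:
        # Like the real statement, let __exit__ decide whether to swallow the error.
        if not manager.__exit__(*sys.exc_info()):
            raise
        return None
    manager.__exit__(None, None, None)
    return result


class Resource(object):
    """Hypothetical manager that records what happened to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not suppress exceptions


resource = Resource()
length = with_context(resource, lambda r: len(r.events))
```

The cost is visible at the call site: every block body becomes a function or lambda, which is the complication the paragraph above warns about.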
Definitely having separate codebases is about the worst thing you could do, so I'd certainly recommend working away from that. At the least, don't arbitrarily use features from newer versions of Python, since although it may look nice to have them in the code (simplifying a particular block of logic perhaps), the fact that you have to duplicate that logic by forking the codebase, even on a single module, is going to more than negate the benefits.
We stick with older versions for legacy code, tweaking as new releases come out to support them but maintaining support for the older ones, sometimes with small adapter layers. At some point, a major release of our code shows up on the schedule, and we consider whether it's time to drop support for an older Python. When that happens, we try to leapfrog several versions, going (for example) from 2.4 to 2.6 directly, and only then start really taking advantage of the new syntax and non-adaptable features.
First of all you need to keep in mind that Python 2.x shares mostly the same syntax, which is backward compatible, new features & additions aside. There are other things to consider that aren't necessarily errors, such as DeprecationWarning messages that, while not detrimental, are ugly and can cause confusion.
Python 3.x is backward-INcompatible by design and intends to leave all of the old cruft behind. Python 2.6 introduced many changes that are also in Python 3.x to help ease the transition. To see all of them I would recommend reading up on the What's New in Python 2.6 document. For this reason, it is very possible to write code for Python 2.6 that will also run in Python 3.1, but that is not without its caveats.
Even so, there are many minor syntax changes even between 2.x versions that will require you to wrap a lot of your code in try/except blocks, so if this is what you're willing to do then having a 2.x and 3.x branch is totally possible. I think you'll find that you'll be doing a lot of attribute and type tests on your objects to do what you want to do.
I would recommend you check out the code of major projects out there that support various Python versions. Twisted Matrix is the first one that comes to mind. Their code is a wonderful example of how Python code should be written.
In the end, what you're setting out to do will not be easy, so prepare yourself for a lot of work!
You could try virtualenv and distribute your application using a single Python version. This may or may not be practical in your case though.
We have a related problem: a large system that supports both Jython and CPython back to 2.4. Basically you need to isolate code that needs to be written differently into a hopefully small set of modules, and have things get imported conditionally.
# module svn.py
import sys
if sys.platform.startswith('java'):
    from jythonsvn import *
else:
    from nativesvn import *
In your example you would use tests against sys.version_info, presumably. You could define some simple things in a utility module, that you would use like: from util import *
# module util.py
import sys
# sys.version_info is (major, minor, ...), e.g. (2, 4, ...) on Python 2.4
if sys.version_info[0] == 2:
    if sys.version_info[1] == 4:
        from util_py4 import *
    ...
Then things in util_py4.py like:
def any(seq): # define workaround functions where possible
    for a in seq:
        if a: return True
    return False
...
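An alternative to comparing version numbers is to test for the feature itself and fall back only when it is missing. A small sketch, where the name any_compat is made up for this example:

```python
try:
    any_compat = any          # the builtin exists from Python 2.5 onward
except NameError:
    def any_compat(seq):      # fallback for Python 2.4
        for item in seq:
            if item:
                return True
        return False

has_positive = any_compat(n > 0 for n in [-2, 0, 3])
```

This keeps working if a later release adds or backports the builtin, without touching a table of version numbers.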
Although this is a different problem than porting (since you want to continue to support), this link gives some useful guidance http://python3porting.com/preparing.html (as do a variety of other articles about porting python 2.x).
Your comment that you just cannot live without context managers is a little confusing though.
While context managers are powerful and make the code more readable and minimize the risk of errors, you just won't be able to have them in the code of your 2.4 version.
### 2.5 (with appropriate future import) and later
with open('foo','rb') as myfile:
    # do something with myfile
### 2.4 and earlier
myfile = None
try:
    myfile = open('foo','rb')
    # do something with myfile
finally:
    if myfile: myfile.close()
Since you want to support 2.4 you'll have a body of code that just has to have the second syntax. Will it really be more elegant to write it BOTH ways?
Related
I'm creating a project in Python that I intend to be able to run under both Python 2.7 and Python 3. I created a class where it became apparent that a nice piece of functionality was available under Python 3 using some Python 3-specific functionality. I don't believe I can replicate the same functionality in Python 2.7, and am not trying to do so. But I intend for the Python 3 app to perhaps have some additional functionality as a consequence.
Anyway, I was hoping that so long as the 2.7 app never called the functions that used the 3.x functionality I'd be okay. But, no, because the presence of the code generates a compile-time error in 2.7, so it spits the dummy despite the function never being called at runtime. And because of Python's lack of any compile-time guards I'm not entirely sure what the best solution is.
I guess I could create a subclass of MyClass, call it MyClass3, put it in another module and add the extra functions there. But that makes a lot of things substantially grubbier...many more split code paths based on sys.version_info, circular inclusion problems unless I do a lot of file-splitting and...(waves hand). It's a mess that way. But maybe it's the only option available?
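The compile-time failure described above happens when a file is compiled, which is when it is first imported. So one possible arrangement, sketched here under the assumption that the Python 3-only code can live in its own module, is to guard the import rather than the call. To keep the sketch in one file, a source string stands in for that hypothetical module; PY3_ONLY_SOURCE, delegate and HAS_DELEGATE are made-up names.

```python
import sys

# Stand-in for the contents of a hypothetical py3_only.py module.
PY3_ONLY_SOURCE = """
def delegate(iterable):
    yield from iterable
"""

py3_features = {}
if sys.version_info[0] >= 3:
    # Only compiled here, so a Python 2 interpreter never parses "yield from".
    exec(compile(PY3_ONLY_SOURCE, "<py3_only>", "exec"), py3_features)

# The rest of the code checks this flag before offering the extra functionality.
HAS_DELEGATE = "delegate" in py3_features
```

In a real project the guarded line would be an ordinary import of the Python 3-only module, placed inside the same version check.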
EDIT:
The original question made reference to "yield from" which is why the answer below discusses it. But the original question was not actually looking for advice on how to get "yield from" working in 2.7, but the moderator seemed to THINK this was what the question was about and flagged it as a duplicate accordingly.
As it happened, just as I edited the question to focus it on the issue of organizing the project to avoid compile errors (and to remove references to "yield from"), an answer came in that referenced the yield from issue and turned out to be super-useful.
yield from was backported to Python 2.7 in the following module: yieldfrom.
There is also a SO question about implementing yield from functionality back to python 2 that you may find useful as well as a blog post on the same topic.
AFAIK, there is no official backport of the functionality so there is nothing like from __future__ import yieldfrom that one could expect (please correct if you know otherwise).
Why does Python seem slower, on average, than C/C++? I learned Python as my first programming language, but I've only just started with C and already I feel I can see a clear difference.
Python is a higher level language than C, which means it abstracts the details of the computer from you - memory management, pointers, etc, and allows you to write programs in a way which is closer to how humans think.
It is true that C code usually runs 10 to 100 times faster than Python code if you measure only the execution time. However if you also include the development time Python often beats C. For many projects the development time is far more critical than the run time performance. Longer development time converts directly into extra costs, fewer features and slower time to market.
Internally the reason that Python code executes more slowly is because code is interpreted at runtime instead of being compiled to native code at compile time.
Other interpreted languages such as Java bytecode and .NET bytecode run faster than Python because the standard distributions include a JIT compiler that compiles bytecode to native code at runtime. The reason why CPython doesn't have a JIT compiler already is because the dynamic nature of Python makes it difficult to write one. There is work in progress to write a faster Python runtime so you should expect the performance gap to be reduced in the future, but it will probably be a while before the standard Python distribution includes a powerful JIT compiler.
CPython is particularly slow because it has no Just in Time optimizer (since it's the reference implementation and chooses simplicity over performance in certain cases). Unladen Swallow is a project to add an LLVM-backed JIT into CPython, and achieves massive speedups. It's possible that Jython and IronPython are much faster than CPython as well as they are backed by heavily optimized virtual machines (JVM and .NET CLR).
One thing that will arguably leave Python slower however, is that it's dynamically typed, and there is tons of lookup for each attribute access.
For instance calling f on an object A will cause possible lookups in __dict__, calls to __getattr__, etc, then finally call __call__ on the callable object f.
With respect to dynamic typing, there are many optimizations that can be done if you know what type of data you are dealing with. For example in Java or C, if you have a straight array of integers you want to sum, the final assembly code can be as simple as fetching the value at the index i, adding it to the accumulator, and then incrementing i.
In Python, it is very hard to make code this optimal. Say you have a list subclass object containing ints. Before even adding any, Python must call list.__getitem__(i), then add that to the "accumulator" by calling accumulator.__add__(n), then repeat. Tons of alternative lookups can happen here because another thread may have altered, for example, the __getitem__ method, the dict of the list instance, or the dict of the class, between calls to add or getitem. Even finding the accumulator and list (and any variable you're using) in the local namespace causes a dict lookup. This same overhead applies when using any user defined object, although for some built-in types, it's somewhat mitigated.
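A small sketch of why the interpreter cannot simply assume what values[i] means: the method can be replaced while the program runs. Numbers and index_sum are illustrative names, not anything from a library.

```python
class Numbers(list):
    """A list subclass, as in the example above."""


def index_sum(values):
    total = 0
    for i in range(len(values)):
        total = total + values[i]   # __getitem__ and __add__ are resolved on every pass
    return total


data = Numbers([1, 2, 3])
before = index_sum(data)

# Any code, at any moment, may rebind the method on the class.
Numbers.__getitem__ = lambda self, i: 10 * list.__getitem__(self, i)
after = index_sum(data)
```

The same loop over the same object now gives a different result, so the lookup has to be repeated rather than compiled away once.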
It's also worth noting, that the primitive types such as bigint (int in Python 3, long in Python 2.x), list, set, dict, etc, etc, are what people use a lot in Python. There are tons of built in operations on these objects that are already optimized enough. For example, for the example above, you'd just call sum(list) instead of using an accumulator and index. Sticking to these, and a bit of number crunching with int/float/complex, you will generally not have speed issues, and if you do, there is probably a small time critical unit (a SHA2 digest function, for example) that you can simply move out to C (or Java code, in Jython). The fact is, that when you code C or C++, you are going to waste lots of time doing things that you can do in a few seconds/lines of Python code. I'd say the tradeoff is always worth it except for cases where you are doing something like embedded or real time programming and can't afford it.
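To see the effect of sticking to the built-ins on your own machine, a rough timing sketch like the following can help. The function name manual_sum and the sizes are arbitrary choices, and the absolute numbers will differ between machines and Python versions.

```python
import timeit

data = list(range(1000))


def manual_sum(values):
    # Index-and-accumulate loop, as described above.
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


loop_seconds = timeit.timeit(lambda: manual_sum(data), number=200)
builtin_seconds = timeit.timeit(lambda: sum(data), number=200)
```

Comparing loop_seconds with builtin_seconds shows how much of the work the C-implemented sum saves for this particular case.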
Compilation vs interpretation isn't important here: Python is compiled, and it's a tiny part of the runtime cost for any non-trivial program.
The primary costs are: the lack of an integer type which corresponds to native integers (making all integer operations vastly more expensive), the lack of static typing (which makes resolution of methods more difficult, and means that the types of values must be checked at runtime), and the lack of unboxed values (which reduce memory usage, and can avoid a level of indirection).
Not that any of these things aren't possible or can't be made more efficient in Python, but the choice has been made to favor programmer convenience and flexibility, and language cleanness over runtime speed. Some of these costs may be overcome by clever JIT compilation, but the benefits Python provides will always come at some cost.
The difference between python and C is the usual difference between an interpreted (bytecode) and compiled (to native) language. Personally, I don't really see python as slow, it manages just fine. If you try to use it outside of its realm, of course, it will be slower. But for that, you can write C extensions for python, which puts time-critical algorithms in native code, making it way faster.
Python is typically implemented as a scripting language. That means it goes through an interpreter which means it translates code on the fly to the machine language rather than having the executable all in machine language from the beginning. As a result, it has to pay the cost of translating code in addition to executing it. This is true even of CPython even though it compiles to bytecode which is closer to the machine language and therefore can be translated faster. With Python also comes some very useful runtime features like dynamic typing, but such things typically cannot be implemented even on the most efficient implementations without heavy runtime costs.
If you are doing very processor-intensive work like writing shaders, it's not uncommon for Python to be somewhere around 200 times slower than C++. If you use CPython, that time can be cut in half but it's still nowhere near as fast. With all those runtime goodies comes a price. There are plenty of benchmarks to show this and here's a particularly good one. As admitted on the front page, the benchmarks are flawed. They are all submitted by users trying their best to write efficient code in the language of their choice, but it gives you a good general idea.
I recommend you try mixing the two together if you are concerned about efficiency: then you can get the best of both worlds. I'm primarily a C++ programmer but I think a lot of people tend to code too much of the mundane, high-level code in C++ when it's just a nuisance to do so (compile times as just one example). Mixing a scripting language with an efficient language like C/C++ which is closer to the metal is really the way to go to balance programmer efficiency (productivity) with processing efficiency.
Comparing C/C++ to Python is not a fair comparison. Like comparing a F1 race car with a utility truck.
What is surprising is how fast Python is in comparison to its peers of other dynamic languages. While the methodology is often considered flawed, look at The Computer Language Benchmark Game to see relative language speed on similar algorithms.
Comparisons to Perl, Ruby, and C# are more 'fair'.
Aside from the answers already posted, one thing is Python's ability to change things during runtime, which you can't do in other languages such as C. You can add member functions to classes as you go.
Also, Python's dynamic nature makes it impossible to say what type of parameters will be passed to a function, which in turn makes optimizing a whole lot harder.
RPython seems to be a way of getting around the optimization problem.
Still, it probably won't be near the performance of C for number-crunching and the like.
C and C++ compile to native code- that is, they run directly on the CPU. Python is an interpreted language, which means that the Python code you write must go through many, many stages of abstraction before it can become executable machine code.
Python is a high-level programming language. Here is how a python script runs:
The python source code is first compiled into Byte Code. Yes, you heard me right! Though Python is an interpreted language, it first gets compiled into byte code. This byte code is then interpreted and executed by the Python Virtual Machine(PVM).
This compilation and execution are what make Python slower than other low-level languages such as C/C++. In languages such as C/C++, the source code is compiled into binary code which can be directly executed by the CPU, thus making their execution more efficient than that of Python.
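The byte code step described above can be inspected from Python itself with the standard dis module. A short sketch; the exact instruction names vary between CPython versions, so none are listed here.

```python
import dis


def add(a, b):
    return a + b


bytecode = add.__code__.co_code                      # raw bytes executed by the virtual machine
opnames = [ins.opname for ins in dis.get_instructions(add)]
```

Printing opnames shows the handful of virtual machine instructions behind the single line a + b, each of which the interpreter loop has to dispatch at run time.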
This answer applies to Python 3. Most people do not know that a JIT-like compile occurs whenever you use the import statement. CPython will search for the imported source file (.py), take notice of the modification date, then look for a compiled-to-bytecode file (.pyc) in a subfolder named "__pycache__" (dunder pycache dunder). If everything matches then your program will use that bytecode file until something changes (you change the source file or upgrade Python).
But this never happens with the main program which is usually started from a BASH shell, interactively or via. Here is an example:
#!/usr/bin/python3
# title : /var/www/cgi-bin/name2.py
# author: Neil Rieck
# edit : 2019-10-19
# ==================
import name3 # name3.py will be cache-checked and/or compiled
import name4 # name4.py will be cache-checked and/or compiled
import name5 # name5.py will be cache-checked and/or compiled
#
def main():
    #
    # code that uses the imported libraries goes here
    #
    pass # placeholder so the function body is valid
if __name__ == "__main__":
    main()
#
Once executed, the compiled output code will be discarded. However, your main python program will be compiled if you start up via an import statement like so:
#!/usr/bin/python3
# title : /var/www/cgi-bin/name1
# author: Neil Rieck
# edit : 2019-10-19
# ==================
import name2 # name2.py will be cache-checked and/or compiled
#name2.main() #
And now for the caveats:
if you were testing code interactively in the Apache area, your compiled file might be saved with privs that Apache can't read (or write on a recompile)
some claim that the subfolder "__pycache__" (dunder pycache dunder) needs to be available in the Apache config
will SELinux allow CPython to write to subfolder (this was a problem in CentOS-7.5 but I believe a patch has been made available)
One last point. You can access the compiler yourself, generate the pyc files, then change the protection bits as a workaround to any of the caveats I've listed. Here are two examples:
method #1
=========
python3
import py_compile
py_compile.compile("name1.py")
exit()
method #2
=========
python3 -m py_compile name1.py
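The two methods above can also be scripted, including the change of protection bits. A sketch under stated assumptions: it works on a throwaway copy of name1.py in a temporary directory, and the mode 0o644 is only an example of "readable by the web server account", not a recommendation for any particular setup.

```python
import importlib.util
import os
import py_compile
import tempfile

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "name1.py")
with open(source_path, "w") as handle:
    handle.write("print('hello from name1')\n")

# Returns the path of the bytecode file it wrote.
pyc_path = py_compile.compile(source_path, doraise=True)

# Make the cached file readable by other users (for example the web server account).
os.chmod(pyc_path, 0o644)
```

By default the returned path is the same one that importlib.util.cache_from_source(source_path) reports, which is where a later import will look.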
Python is usually described as an interpreted language: the source is not compiled ahead of time to native machine code, so it does not run directly on the CPU hardware.
But there are ways to make Python programs run faster:
1. Use Python 3 to run your code (on Ubuntu or any other Linux distro, run python3 main.py) and update Python regularly, along with the modules and libraries you use; I suggest using pip3 for that.
2. Use Numba, a Python library with a JIT compiler. It is aimed at numerical code, and it can also use GPU acceleration for your program.
3. Use a profiler to see which functions or statements take the most execution time, so you know what to rewrite in a faster way.
4. Use multithreading or multiprocessing so that your program uses more CPU cores and threads.
5. Move the slow parts into C or C++ extensions for CPython to make them much faster.
6. Debug and test your code to remove bugs, which can also make it a little faster; application logging helps with debugging.
And then some smaller things that make your code faster:
1. Know the basic data structures and use them well.
2. Reduce the memory footprint of your code.
3. Use built-in functions and libraries.
4. Move calculations outside the loop.
5. Keep your code base small.
With these techniques your code gets much faster, so Python does not have to be a slow programming language.
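As an illustration of the profiler tip, here is a minimal sketch using the standard cProfile and pstats modules. The function slow_squares is a made-up stand-in for whatever code you want to measure.

```python
import cProfile
import io
import pstats


def slow_squares(n):
    # Deliberately plain loop so it shows up in the profile.
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
result = profiler.runcall(slow_squares, 20000)

# Collect the five most expensive entries, sorted by cumulative time, as text.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

Printing report lists the functions with their call counts and times, which tells you where an optimization is worth the effort.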
I have been using a lot of context managers as a clean way of composing various setup/teardown situations. Since my deployments target Python 2.6, this means using contextlib.nested.
Lately I've been interested in supporting both Python 2.x and Python 3 with the same code base. This has been possible with some projects, but I'm running into trouble in the case of context managers because:
contextlib.nested isn't supported in Python 3
Python-3 style nested context managers (e.g., with aa() as a, bb() as b: ...) aren't supported in 2.6.
There is a basic syntactic incompatibility here. For various reasons beyond my control, 2.7 may be difficult to get into production for now, but I'd like to future-proof the code as much as possible, hence the Python 3 interest.
Can anyone suggest a workaround for supporting nested context managers in the same code base for 2.6 and 3.x? Or is this a lost cause?
From the docs:
This function has two major quirks that have led to it being deprecated. Firstly, as the context managers are all constructed before the function is invoked, the __new__() and __init__() methods of the inner context managers are not actually covered by the scope of the outer context managers. That means, for example, that using nested() to open two files is a programming error as the first file will not be closed promptly if an exception is thrown when opening the second file.
Secondly, if the __enter__() method of one of the inner context managers raises an exception that is caught and suppressed by the __exit__() method of one of the outer context managers, this construct will raise RuntimeError rather than skipping the body of the with statement.
Thus in almost all cases the correct answer is JBernardo's. It's a bit more indenting but it's a bit less buggy, too.
Just nest them
with aa() as a:
    with bb() as b:
        #some code here
If the quirks of nested that Veedrac mentioned aren't an issue for you, you can just copy the code from the Python standard library.
If they do bother you, then your only choices are to manually nest them, or to drop Python 2.6 support. It really doesn't matter if you use two code-bases or one for this. If this is the case, then the only way it will work in Python 2.6 would be to nest them. I guess you could play with writing some kind of custom 2to3 fixer that translates your unnested 2.7 code into nested 2.6 code. But honestly, it will be less painful to just use a single code-base with nested managers until you can drop 2.6 support.
You can always reimplement nested on your own and keep it in a compatibility.py file within the project. This is often what is done to cross versions.
Edit: I see that @JBernardo already mentioned this solution in a comment.
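For reference, a reimplementation along these lines could look roughly like the sketch below. It follows the general shape of the old standard-library helper, so it shares the quirks quoted from the docs above. One simplification is mine: it re-raises with raise exc[1] so the same line is valid on 2.6 and 3.x, at the cost of the original traceback on Python 2. The tracked helper exists only to show the ordering.

```python
import sys
from contextlib import contextmanager


@contextmanager
def nested(*managers):
    exits = []
    values = []
    exc = (None, None, None)
    try:
        for manager in managers:
            exit_method = manager.__exit__
            values.append(manager.__enter__())
            exits.append(exit_method)
        yield values
    except:
        exc = sys.exc_info()
    finally:
        # Leave the managers in reverse order, innermost first.
        while exits:
            exit_method = exits.pop()
            try:
                if exit_method(*exc):
                    exc = (None, None, None)   # exception was suppressed
            except:
                exc = sys.exc_info()
        if exc != (None, None, None):
            raise exc[1]


events = []


@contextmanager
def tracked(name):
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)


with nested(tracked("a"), tracked("b")) as (a, b):
    events.append("body " + a + b)
```

Kept in a compatibility module, this lets the calling code stay identical on both versions until 2.6 support can be dropped.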
I'm developing a web game in pure Python, and want some simple scripting available to allow for more dynamic game content. Game content can be added live by privileged users.
It would be nice if the scripting language could be Python. However, it can't run with access to the environment the game runs on since a malicious user could wreak havoc which would be bad. Is it possible to run sandboxed Python in pure Python?
Update: In fact, since true Python support would be way overkill, a simple scripting language with Pythonic syntax would be perfect.
If there aren't any Pythonic script interpreters, are there any other open source script interpreters written in pure Python that I could use? The requirements are support for variables, basic conditionals and function calls (not definitions).
This is really non-trivial.
There are two ways to sandbox Python. One is to create a restricted environment (i.e., very few globals etc.) and exec your code inside this environment. This is what Messa is suggesting. It's nice but there are lots of ways to break out of the sandbox and create trouble. There was a thread about this on Python-dev a year ago or so in which people did things from catching exceptions and poking at internal state to break out to byte code manipulation. This is the way to go if you want a complete language.
The other way is to parse the code and then use the ast module to kick out constructs you don't want (e.g. import statements, function calls etc.) and then to compile the rest. This is the way to go if you want to use Python as a config language etc.
Another way (which might not work for you since you're using GAE), is the PyPy sandbox. While I haven't used it myself, word on the intertubes is that it's the only real sandboxed Python out there.
Based on your description of the requirements (support for variables, basic conditionals and function calls, not definitions), you might want to evaluate approach 2 and kick out everything else from the code. It's a little tricky but doable.
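Approach 2 could be sketched roughly as follows, assuming Python 3.8 or later (where literals parse as ast.Constant). The whitelist, the function name check_script and the game functions heal and say are all invented for this example, and a whitelist like this should be treated as a starting point to review, not as a proven security boundary.

```python
import ast

ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.If,
    ast.Name, ast.Load, ast.Store, ast.Constant,
    ast.Call, ast.Compare, ast.BinOp, ast.BoolOp, ast.UnaryOp,
    ast.operator, ast.cmpop, ast.boolop, ast.unaryop,
)
ALLOWED_CALLS = frozenset(["heal", "say"])  # hypothetical game API


def check_script(source):
    """Parse source and raise ValueError on anything outside the whitelist."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed construct: " + type(node).__name__)
        if isinstance(node, ast.Call):
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in ALLOWED_CALLS):
                raise ValueError("disallowed call")
    return tree


log = []
api = {
    "__builtins__": {},
    "heal": lambda amount: log.append(("heal", amount)),
    "say": lambda text: log.append(("say", text)),
}
script = "hp = 3\nif hp < 10:\n    heal(10 - hp)\n    say('healed')\n"
exec(compile(check_script(script), "<script>", "exec"), api)
```

Imports, attribute access, function definitions and calls to anything outside ALLOWED_CALLS are rejected before the code is compiled, which matches the "variables, conditionals and function calls" requirement.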
Roughly ten years after the original question, Python 3.8.0 comes with auditing. Can it help? Let's limit the discussion to hard-drive writing for simplicity - and see:
from sys import addaudithook
def block_mischief(event,arg):
    if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r')
            or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']): raise IOError('file write forbidden')
addaudithook(block_mischief)
So far exec could easily write to disk:
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
But we can forbid it at will, so that no wicked user can access the disk from the code supplied to exec(). Pythonic modules like numpy or pickle eventually use Python's file access, so they are banned from disk write, too. External program calls have been explicitly disabled, too.
WRITE_LOCK = True
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("open('/tmp/FILE','a').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("numpy.savetxt('/tmp/FILE', numpy.eye(3))", dict(locals()))
exec("import subprocess; subprocess.call('echo PWNED >> /tmp/FILE', shell=True)", dict(locals()))
An attempt to remove the lock from within exec() seems to be futile, since the auditing hook uses a different copy of locals that is not accessible to the code run by exec. Please prove me wrong.
exec("print('muhehehe'); del WRITE_LOCK; open('/tmp/FILE','w')", dict(locals()))
...
OSError: file write forbidden
Of course, the top-level code can enable file I/O again.
del WRITE_LOCK
exec("open('/tmp/FILE','w')", dict(locals()))
Sandboxing within CPython has proven extremely hard and many previous attempts have failed. This approach is also not entirely secure e.g. for public web access:
perhaps hypothetical compiled modules that use direct OS calls cannot be audited by CPython - whitelisting the safe pure pythonic modules is recommended.
Definitely there is still the possibility of crashing or overloading the CPython interpreter.
Maybe there remain even some loopholes to write the files on the harddrive, too. But I could not use any of the usual sandbox-evasion tricks to write a single byte. We can say the "attack surface" of Python ecosystem reduces to rather a narrow list of events to be (dis)allowed: https://docs.python.org/3/library/audit_events.html
I would be thankful to anybody pointing me to the flaws of this approach.
EDIT: So this is not safe either! I am very thankful to @Emu for his clever hack using exception catching and introspection:
#!/usr/bin/python3.8
from sys import addaudithook
def block_mischief(event,arg):
    if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r') or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
        raise IOError('file write forbidden')
addaudithook(block_mischief)
WRITE_LOCK = True
exec("""
import sys
def r(a, b):
    try:
        raise Exception()
    except:
        del sys.exc_info()[2].tb_frame.f_back.f_globals['WRITE_LOCK']
import sys
w = type('evil',(object,),{'__ne__':r})()
sys.audit('open', None, w)
open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')""", dict(locals()))
I guess that auditing+subprocessing is the way to go, but do not use it on production machines:
https://bitbucket.org/fdominec/experimental_sandbox_in_cpython38/src/master/sandbox_experiment.py
AFAIK it is possible to run code in a completely isolated environment:
exec somePythonCode in {'__builtins__': {}}, {}
But in such an environment you can do almost nothing :) (you cannot even import a module; but a malicious user can still run an infinite recursion or cause the process to run out of memory.) You would probably want to add some modules that will be the interface to your game engine.
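The statement form above is Python 2 syntax. In Python 3 the same idea is written with the exec() function, as in this small sketch. The give_item name is a made-up example of an interface function, and as other answers here point out, an empty __builtins__ alone is known to be escapable, so this only illustrates the mechanism.

```python
untrusted = "import os"

try:
    exec(untrusted, {"__builtins__": {}}, {})
    blocked = False
except ImportError:
    # With no __import__ available, the import statement cannot run.
    blocked = True

# Exposing a hand-picked interface to the script instead of the builtins.
calls = []
exec("give_item('potion')", {"__builtins__": {}, "give_item": calls.append}, {})
```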
I'm not sure why nobody mentions this, but Zope 2 has a thing called Python Script, which is exactly that - restricted Python executed in a sandbox, without any access to filesystem, with access to other Zope objects controlled by Zope security machinery, with imports limited to a safe subset.
Zope in general is pretty safe, so I would imagine there are no known or obvious ways to break out of the sandbox.
I'm not sure exactly how Python Scripts are implemented, but the feature has been around since about the year 2000.
And here's the magic behind PythonScripts, with detailed documentation: http://pypi.python.org/pypi/RestrictedPython - it even looks like it doesn't have any dependencies on Zope, so can be used standalone.
Note that this is not for safely running arbitrary python code (most of the random scripts will fail on first import or file access), but rather for using Python for limited scripting within a Python application.
This answer is from my comment to a question closed as a duplicate of this one: Python from Python: restricting functionality?
I would look into a two-server approach. The first server is the privileged web server where your code lives. The second server is a very tightly controlled server that only provides a web service or RPC service and runs the untrusted code. You provide your content creator with your custom interface. For example, if you allowed the end user to create items, you would have a lookup that calls the server with the code to execute and the set of parameters.
Here's an abstract example for a healing potion.
{function_id='healing potion', action='use', target='self', inventory_id='1234'}
The response might be something like
{hp='+5' action={destroy_inventory_item, inventory_id='1234'}}
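To make the exchange above a bit more concrete, here is a minimal sketch of what the restricted server's dispatch could look like. Everything here is an assumption for illustration: the JSON encoding, the `HANDLERS` table, and the function names are not from any particular framework, and the transport (HTTP, RPC) is left out entirely.

```python
import json


def use_healing_potion(request):
    # Hypothetical item logic: heal 5 HP and consume the item.
    return {
        "hp": "+5",
        "action": {
            "name": "destroy_inventory_item",
            "inventory_id": request["inventory_id"],
        },
    }


# Lookup from (function_id, action) to the code that handles it.
HANDLERS = {("healing potion", "use"): use_healing_potion}


def handle_request(raw):
    """Decode one request, dispatch it, and return the encoded response."""
    request = json.loads(raw)
    key = (request.get("function_id"), request.get("action"))
    handler = HANDLERS.get(key)
    if handler is None:
        return json.dumps({"error": "unknown function or action"})
    return json.dumps(handler(request))


print(handle_request(json.dumps({
    "function_id": "healing potion",
    "action": "use",
    "target": "self",
    "inventory_id": "1234",
})))
```

The privileged server only ever sees the decoded response, so it can validate fields such as `inventory_id` before acting on them.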
Hmm. This is a thought experiment, I don't know of it being done:
You could use the compiler package to parse the script. You can then walk this tree, prefixing all identifiers (variables, method names, etc., as well as has|get|setattr invocations and so on) with a unique preamble so that they cannot possibly refer to your variables. You could also ensure that the compiler package itself was not invoked, and perhaps other blacklisted things such as opening files. You then emit the Python code for this, and compiler.compile it.
The docs note that the compiler package is not in Python 3.0, but do not mention what the 3.0 alternative is.
In general, this is parallel to how forum software and such try to whitelist 'safe' JavaScript or HTML, etc. And they historically have a bad record of stomping all the escapes. But you might have more luck with Python :)
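In Python 3 the standard-library `ast` module covers the parsing and tree-walking that the `compiler` package did. The following is only a sketch of the prefixing idea using `ast` instead of `compiler`; the `PREFIX` value and the class name are invented here, and the transformer is deliberately incomplete (it does not handle attributes, keyword arguments, classes, or string-based tricks such as `getattr`).

```python
import ast

PREFIX = "user_"  # hypothetical preamble added to every identifier


class PrefixNames(ast.NodeTransformer):
    """Rename variables, function names and parameters; refuse imports."""

    def visit_Name(self, node):
        node.id = PREFIX + node.id
        return node

    def visit_arg(self, node):
        node.arg = PREFIX + node.arg
        return node

    def visit_FunctionDef(self, node):
        node.name = PREFIX + node.name
        self.generic_visit(node)  # also rewrite parameters and body
        return node

    def visit_Import(self, node):
        raise ValueError("import statements are not allowed")

    visit_ImportFrom = visit_Import


def rewrite(source):
    """Parse the script, prefix its identifiers, and compile the result."""
    tree = PrefixNames().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return compile(tree, "<script>", "exec")


env = {"__builtins__": {}}
exec(rewrite("def double(n):\n    return n * 2\nresult = double(21)"), env)
print(env["user_result"])
```

Because every name is rewritten, the script can only reach objects that the host deliberately places in the namespace under the prefixed name.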
I think your best bet is going to be a combination of the replies thus far.
You'll want to parse and sanitise the input - removing any import statements for example.
You can then use Messa's exec sample (or something similar) to allow the code execution against only the builtin variables of your choosing - most likely some sort of API defined by yourself that provides the programmer access to the functionality you deem relevant.
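A rough sketch of that combination, assuming Python 3 and the standard `ast` module: the script is parsed first and refused (rather than silently edited) if it contains an import or touches an underscore-prefixed attribute, and only then executed against a hand-picked API. The helper names `check_script` and `run_script` are hypothetical, and the checks shown are far from a complete blacklist.

```python
import ast


def check_script(source):
    """Parse the script; refuse imports and underscore-prefixed attributes."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("line %d: import is not allowed" % node.lineno)
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            raise ValueError("line %d: private attribute access" % node.lineno)
    return compile(tree, "<user script>", "exec")


def run_script(source, api):
    """Run a checked script with no builtins except the supplied API."""
    env = {"__builtins__": {}}
    env.update(api)
    exec(check_script(source), env)
    return env


env = run_script("total = add(2, 3)", {"add": lambda a, b: a + b})
print(env["total"])
```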
Is there something similar to Pylint, that will look at a Python script (or run it), and determine which version of Python each line (or function) requires?
For example, theoretical usage:
$ magic_tool <<EOF
with something:
    pass
EOF
1: 'with' statement requires Python 2.6 or greater
$ magic_tool <<EOF
class Something:
    @classmethod
    def blah(cls):
        pass
EOF
2: classmethod requires Python 2.2 or greater
$ magic_tool <<EOF
print """Test
"""
EOF
1: Triple-quote requires Python 1.5 or later
Is such a thing possible? I suppose the simplest way would be to have all Python versions on disc, run the script with each one and see what errors occur.
Inspired by this excellent question, I recently put together a script that tries to do this. You can find it on github at pyqver.
It's reasonably complete but there are some aspects that are not yet handled (as mentioned in the README file). Feel free to fork and improve it!
Not an actually useful answer, but here goes anyway.
I think this should be doable (though probably quite an exercise). For example, you could make sure you have the official grammars for all the versions you want to check, like this one.
Then parse the bit of code starting with the first grammar version.
Next you need a similar map of all the built-in module namespaces and parse the code again starting with the earliest version, though it might be tricky to differentiate between built-in modules and modules that are external or something in between like ElementTree.
The result should be an overview of versions that support the syntax of the code and an overview of the modules and which version (if at all) is needed to use it. With that result you could calculate the best lowest and highest version.
The tool pyqver from Greg Hewgill hasn't been updated in a while.
vermin is a similar utility which, in verbose mode (-vvv), shows which lines are considered in the decision.
% pip install vermin
% vermin -vvv somescript.py
Detecting python files..
Analyzing using 8 processes..
!2, 3.6 /path/to/somescript.py
L13: f-strings require 3.6+
L14: f-strings require 3.6+
L15: f-strings require 3.6+
L16: f-strings require 3.6+
print(expr) requires 2+ or 3+
Minimum required versions: 3.6
Incompatible versions: 2
Bonus: With the parameter -t=V you can define a target version V you want to be compatible with. If this version requirement is not met, the script will exit with exit code 1, making it easy to integrate into a test suite.