This question pops up quite often, and with the exception of PyPy’s sandboxing feature (whose security I can’t personally vouch for, though it was designed by some very smart people), the majority of “Python level” solutions look no less porous than Swiss cheese. So here I’ll demonstrate one approach to sandboxing arbitrary untrusted code that is easy to understand, and leverages a very mature yet much underappreciated Linux kernel feature called seccomp. This feature is the juice behind Google Chrome’s security architecture on Linux, although its heritage stretches back much further.
Once seccomp is engaged in its original “strict” mode, the calling process can no longer invoke any OS service except the read(2), write(2), _exit(2), and sigreturn(2) system calls. That means no matter what code is running inside the process, it cannot interact with the external system except by terminating itself, or by reading from or writing to file descriptors that are already open.
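To make the effect of strict mode concrete, here is a minimal probe, not the demo code itself. It assumes Linux and calls prctl(2) through ctypes rather than the python-prctl package; probe_strict_seccomp is a hypothetical name. It forks a child, engages strict mode in it, and lets the child attempt a system call outside the allow-list. Some environments refuse strict mode (for example when a seccomp filter is already installed, as in many container runtimes), so the probe reports that case rather than assuming success.

```python
import ctypes
import os
import signal

PR_SET_SECCOMP = 22       # from <linux/prctl.h>
SECCOMP_MODE_STRICT = 1   # from <linux/seccomp.h>


def probe_strict_seccomp():
    """Fork a child, engage strict seccomp in it, and report the outcome.

    Returns "killed" if the kernel killed the child with SIGKILL, or
    "unsupported" if strict mode could not be engaged here.
    """
    # Load libc before forking: dlopen() needs syscalls that strict mode forbids.
    libc = ctypes.CDLL(None, use_errno=True)
    pid = os.fork()
    if pid == 0:  # child
        try:
            rc = libc.prctl(
                PR_SET_SECCOMP,
                ctypes.c_ulong(SECCOMP_MODE_STRICT),
                ctypes.c_ulong(0),
                ctypes.c_ulong(0),
                ctypes.c_ulong(0),
            )
            if rc == 0:
                # getppid(2) is not on the allow-list, so the kernel
                # delivers SIGKILL here if strict mode is active.
                os.getppid()
        finally:
            # Reached only when strict mode was not engaged.
            os._exit(2)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL:
        return "killed"
    return "unsupported"


if __name__ == "__main__":
    print(probe_strict_seccomp())
```

One wrinkle worth knowing: on current glibc, os._exit() is implemented with exit_group(2), which strict mode does not permit, so a child that wants to exit with a status rather than be killed needs the raw exit system call.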
Using a parent process and a very simple SecureEvalHost class, we can arrange to fork a child that will close any file descriptors except a special command pipe shared with the parent, set process limits to prevent infinite loops from consuming 100% CPU, engage seccomp, and finally enter a loop that reads untrusted code from the command pipe, executes it, and writes the execution result back to the command pipe.
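The shape of that arrangement can be sketched as follows. This is not the SecureEvalHost code from the demo: EvalHostSketch, close_other_fds and child_loop are hypothetical names, the pipe is a socketpair, messages are newline-delimited JSON purely to keep the sketch short, and /proc is assumed to be mounted. Most importantly, it deliberately stops short of engaging seccomp, so on its own it provides no security at all.

```python
import json
import os
import resource
import socket


def close_other_fds(keep):
    """Close every open descriptor except those in `keep` (Linux /proc only)."""
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if fd not in keep:
            try:
                os.close(fd)
            except OSError:
                pass  # e.g. the descriptor listdir() itself used, already closed


def child_loop(stream):
    """Read one JSON request per line, evaluate it, write one JSON reply per line."""
    for line in stream:
        try:
            value = eval(json.loads(line)["code"], {})
            json.dumps(value)  # raises if the result is not JSON-encodable
            reply = {"ok": True, "value": value}
        except Exception as exc:
            reply = {"ok": False, "error": repr(exc)}
        stream.write(json.dumps(reply).encode("utf-8") + b"\n")
        stream.flush()


class EvalHostSketch:
    def __init__(self, cpu_seconds=2):
        parent_sock, child_sock = socket.socketpair()
        self.pid = os.fork()
        if self.pid == 0:  # child
            try:
                close_other_fds({child_sock.fileno()})
                # Cap CPU time so an infinite loop cannot spin forever.
                resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
                # A real sandbox would engage seccomp at this point, before any
                # untrusted code runs. This sketch does NOT, so it is not secure.
                child_loop(child_sock.makefile("rwb"))
            finally:
                os._exit(0)
        child_sock.close()
        self.sock = parent_sock
        self.stream = parent_sock.makefile("rwb")

    def evaluate(self, code):
        self.stream.write(json.dumps({"code": code}).encode("utf-8") + b"\n")
        self.stream.flush()
        line = self.stream.readline()
        if not line:
            raise EOFError("child exited (killed or crashed)")
        return json.loads(line)

    def close(self):
        self.stream.close()
        self.sock.close()
        os.waitpid(self.pid, 0)


if __name__ == "__main__":
    host = EvalHostSketch()
    print(host.evaluate("1 + 2"))
    host.close()
```

Everything the child needs (json, the socket file object) is set up before the evaluation loop starts, which mirrors the rule that nothing may be loaded from disk once the sandbox is engaged.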
Since no filesystem access is possible once seccomp is engaged, any modules required by the untrusted code must be fully loaded before forking. One huge warning: the untrusted code will have access to any in-memory state the parent process held at the time of the fork. This means if your host has confidential code loaded, it might be exfiltrated via an untrusted script. The source text itself is usually not in memory (except perhaps in free’d regions), but the in-memory code objects and data are. A simple variant of the demo below (that uses an intermediary exec’d proxy script) would mitigate the exfiltration problem.
Code here (depends on cffi and python-prctl packages)
There are a few problems with this example as it stands. First, and you could consider this a feature or a bug: if Python asks malloc for more memory, and malloc in turn needs to ask the system for that memory, then the process will be killed. This can be fixed by setting a relevant process limit, then using seccomp’s “filter” mode, which allows selective access to e.g. brk(2) or mmap(2).
Another problem is insufficient validation of the result produced by the child: at the very least, we need to verify that the size of the encoded message is less than 4 GiB (the largest value the 32-bit unsigned int used to represent the size can hold).
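As a rough illustration of the kind of validation meant here, the sketch below frames each message as a 32-bit unsigned length followed by a JSON body, and refuses anything oversized, truncated, or malformed. The names (encode_message, decode_message, ProtocolError), the big-endian byte order, and the 1 MiB cap are all assumptions made for the example, not details of the demo code.

```python
import io
import json
import struct

HEADER = struct.Struct(">I")   # 32-bit unsigned length prefix (byte order assumed)
MAX_MESSAGE_SIZE = 1 << 20     # hypothetical 1 MiB policy cap, far below 4 GiB


class ProtocolError(Exception):
    """Raised when a frame is oversized, truncated, or not valid JSON."""


def encode_message(obj):
    """Serialize obj as JSON and prepend its length."""
    body = json.dumps(obj).encode("utf-8")
    if len(body) > MAX_MESSAGE_SIZE:
        raise ProtocolError("message too large to send: %d bytes" % len(body))
    return HEADER.pack(len(body)) + body


def decode_message(read):
    """Decode one frame. `read(n)` returns up to n bytes, fewer only at EOF."""
    header = read(HEADER.size)
    if len(header) != HEADER.size:
        raise ProtocolError("truncated header")
    (size,) = HEADER.unpack(header)
    # Check the claimed size BEFORE trying to read or allocate that much.
    if size > MAX_MESSAGE_SIZE:
        raise ProtocolError("peer claims %d bytes, limit is %d" % (size, MAX_MESSAGE_SIZE))
    body = read(size)
    if len(body) != size:
        raise ProtocolError("truncated body")
    try:
        return json.loads(body.decode("utf-8"))
    except ValueError as exc:  # covers both invalid UTF-8 and invalid JSON
        raise ProtocolError("malformed message") from exc


# Usage: round-trip through an in-memory stream standing in for the pipe.
stream = io.BytesIO(encode_message({"ok": True, "value": 3}))
print(decode_message(stream.read))
```

Over a real pipe, read() can return short counts before EOF, so the `read` callable would need to loop until it has collected n bytes.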
A note on the encoding: the choice of JSON as the message encoding is not incidental! Python’s “pickle” format allows trivial arbitrary code execution whenever untrusted data is unpickled, so using it as the inter-process encoding would introduce a rather large security hole.
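For anyone who hasn’t seen the pickle problem first hand, a harmless demonstration is below. The Payload class is a hypothetical stand-in for bytes a compromised child might send; the expression it smuggles in is benign here, but an attacker would pick something far less friendly.

```python
import json
import pickle


class Payload:
    """Stand-in for attacker-controlled bytes: unpickling it calls eval()."""

    def __reduce__(self):
        # pickle records "call eval with this argument" and replays it at
        # load time. Any importable callable could be named here instead.
        return (eval, ("6 * 7",))


hostile_bytes = pickle.dumps(Payload())

# Loading runs the expression inside the *loading* process.
print(pickle.loads(hostile_bytes))

# JSON has no such mechanism: the same text just comes back as a string.
print(json.loads('"6 * 7"'))
```

If the parent unpickled replies from the sandboxed child, the child could run code in the unsandboxed parent, which defeats the whole exercise.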
Finally, some timeit results:
$ python -mtimeit -s 'import seccomp' 'seccomp.go()'
1000 loops, best of 3: 810 usec per loop
So despite this method’s small size and conceptual simplicity, it’s still sufficient to process roughly 1,200 mutually untrusted pieces of code per second per core (1 s / 810 µs ≈ 1,230), and far more if the child can be reused. That certainly beats any other approach I’m aware of.
With a bit of gdb/lldb scripting, it is possible for the parent to inspect a KILL’d zombie, and extract the Python-level stack trace that resulted in the security violation. Combined with some design and basic infrastructure, that’s about the only missing diagnostic mechanism required to turn a variant of this into a powerful little App Engine-alike.
Edit: oops, forgot to mention: you must absolutely ensure the child does not inherit any privileged kernel objects! For example, a writeable memory mapping. Before employing a solution like this (and continually for any future change), you must ensure none of the child’s code, the child’s loaded modules’ code, the parent’s code, or e.g. any init scripts, LD_PRELOAD libraries, libc Name Service Switch plugin, or process supervisor leaks any such object. In practice this is quite simple to audit, but you *must* check!