Automated Key Generator Generation

Originally published on March 6th, 2012 on OpenRCE, under the overly-academic title "[video] Semi-Automated Input Crafting by Symbolic Execution, with an Application to Automatic Key Generator Generation"

The problem of input crafting can be stated as follows:  given the ability to influence some aspect of a program's execution (say, by supplying it with a hand-crafted file, network input, sequence of keystrokes, etc), we want to know whether it is possible to cause the program to reach some state, for example, a condition where the integer argument to some memory allocation function is zero, a specific sequence of branches being taken causing EIP to obtain a certain value, an array dereference whose index is not within its bounds, etc.  Much work in reverse engineering for vulnerability analysis and software cracking reduces to this problem; some malware analysis problems can also be stated in such terms.

It is well-known in the formal verification literature that symbolic execution, a method for reasoning about program executions based upon formal semantics and theorem proving, may be applied towards this problem.  However, actual tools that can be used in a friendly fashion have not yet materialized in the hands of the common, working reverse engineer.  This blog entry demonstrates a prototype of a tool of this nature.

Crafting inputs with manually-generated SMT instances

We take as our example program a crackme called Kao's Toy Project, which recently attracted the scrutiny of Dcoder and andrewl, two clever computer-hacking mathematician types.  Note:  I would have used a vulnerability analysis example since it might be more compelling to a general audience, however, the notion of producing a reverse engineering video involving commercial software is not a palatable one for legal reasons, and therefore I chose a program that was deliberately designed to be reverse engineered.

The crackme implements the following scheme.  When presented with a 32-byte activation code, the user enters a hexadecimal string such as "01234567-76543210", which is then decomposed into two dwords, one of which is XORed with the other, and the following compact loop is executed:

.text:004010F7 mov     esi, offset a_ActivationCode
.text:004010FC lea     edi, [ebp+Output]
.text:004010FF mov     edx, [ebp+arg_0__SerialDwLow]
.text:00401102 mov     ebx, [ebp+arg_4__SerialDwHigh]
.text:00401105
.text:00401105 compute_output:
.text:00401105 lodsb
.text:00401106 sub     al, bl
.text:00401108 xor     al, dl
.text:0040110A stosb
.text:0040110B rol     edx, 1
.text:0040110D rol     ebx, 1
.text:0040110F loop    compute_output
.text:00401111 mov     byte ptr [edi], 0
.text:00401114 push    offset String2                  ; "0how4zdy81jpe5xfu92kar6cgiq3lst7"
.text:00401119 lea     eax, [ebp+Output]
.text:0040111C push    eax                             ; lpString1
.text:0040111D call    lstrcmpA

Given that the user's input consists of two dwords, the key space is 2^64.  Dcoder and andrewl offer an observation that halves the exponent to 2^32.  Dcoder further offers an algebraic cryptanalysis of the scheme that obliterates it.  andrewl chose to explore the route of modelling the scheme in terms of the Boolean Satisfiability (SAT) problem, and then feeding the result into a SAT solver.  The resulting SAT instance is solved more or less immediately.

We followed andrewl's line of reasoning and manually constructed a representative Satisfiability Modulo Theories (SMT) formula.  (To be technical, the formula is a sentence in a quantifier-free fragment of first-order logic restricted to the theories of equality, bitvectors, and extensional arrays.)  The formula does not explicitly take advantage of any aforementioned, known flaw in the cryptosystem, and instead encodes the algorithm literally.  We believe that the reader will find the resulting SMT formula very clear and easy to understand.  This is an advantage over a SAT instance, which is virtually incomprehensible and harder to debug.  The SMT instance is a bona-fide key generator, and furthermore the theorem prover is able to solve the formula in milliseconds (considerably faster than a brute-force search over a 64-bit keyspace).

We outline the SMT instance:  Line 1 sets the theory. Lines 3-7 declare the variables of interest, as well as the array sort. Lines 9-40 encode the serial algorithm. Lines 42-73 state that the output array must equal the fixed value given in the crackme. Lines 75-106 encode the activation code that was given by a given instance of running the crackme. You can replace these lines with whatever your own activation code was, thereby creating a key generator. Lines 108-110 encode the additional constraints where one of the serial dwords was XORed with the other. Lines 112-113 query the decision procedure and output the results.  After installing Z3 (other theorem provers such as Yices are also suitable), one solves the formula as follows:  C:\Program Files (x86)\Microsoft Research\Z3-3.2\bin>z3 /m /smt2 \temp\kao.smt.

Generating the constraint system automatically, directly from the binary

This is all well and good, but manually encoding algorithms as SMT instances is tedious business.  Instead, we seek a solution that can automatically generate the relevant constraint system directly from the x86 binary.  The current solution is a static analysis (i.e. it does not alter or observe the execution of the program).

The algorithm proceeds in three main phases:

1)  Trace generation.  Since the algorithm is static, the analyst manually specifies some parameters to the system:  namely, an address from which to start the analysis, a model of the initial values of the registers and memory, and some condition that dictates when the analysis should stop.  (Note that, if we were to reformulate our analysis in a dynamic setting, many of these considerations would be superfluous.)  The tool then statically emulates the program, generates a list of instructions encountered during the emulation, and converts them into an intermediate representation.

2)  Trace simplification.  We apply other static analyses against the trace to simplify it, which could potentially speed up the solving.  Analyses could include constant propagation, dead statement elimination, or the abstract interpretation that I published last year.  This step is actually optional in the sense that the simplifications preserve satisfiability.

3)  Constraint generation and solving.  With our trace in hand, we transform it into an SMT instance and feed the resulting equations into a theorem prover.  Additionally, the user supplies some postcondition that corresponds to the state that he or she wishes for the program to enter.  We then check satisfiability, and if the formula is indeed satisfiable, the theorem prover can furnish a model of the inputs that cause the desired state to be reached.

With all of that out of the way, here is a link to the actual video depicting the process.  I apologize if the video is tedious (I am tedious) and for the stupid manner in which I sound (I am stupid, and also I am from the American south).  The files referenced in the video, kao-1.ml through kao-3.ml, are linked to above in the description of the algorithm.

Key-generator generation

Given the construction laid out in the video referenced above, generation of a key generator is now trivial.  Instead of asserting in the postcondition that the activation code is fixed to whatever value was observed in the construction of the constraint system, we allow the user to enter his or her own activation code, which we then assert as the initial values of the activation code array.  We then solve the system as described in the video and part 3 of the code linked above, and provide the user with the proper registration code (EBX and EDX values).

Postscript

An astute proofreader wondered why, during the video, when I solved the manually-constructed instance with Z3, it reported an error about a model not being available.  In fact, the formula solved correctly (as evidenced by the output saying "sat"), but I did not specify the "/m" flag on the command line due to it skewing the time statistics, which resulted in that message being produced.  The reader can manually verify that the formula is correct with the command line given previously.