(This post is part of a series on the subject of my hobby project, which is recreating the C source code for the 1989 game F-15 Strike Eagle II by reverse engineering the original binaries.)

I realize this might be a little controversial, but I wanted to share a bit of my perspective on the use of LLMs in a project like this. I must admit I initially pooh-poohed the insane rate of progress @AJenbo achieved with them, thinking: sure, they can figure out how to do this from all of the research and code I did upfront, but surely I am the Real Deal, only I carry the wisdom, and I will be the final oracle to consult when the dumb machines inevitably reach the limit of their context windows, haha. Judging by the comments on technology forums and such, I think this is a common sentiment among IT professionals. I’m still not sure how all of this will play out in the broader context of LLMs in software development. For sure, there is a lot of grift, greed, rabid hype, lies and outright scams in the industry around “AI” while everybody is trying to make a buck. But I stopped thinking I was smarter than an LLM. Let me show you why with a couple of examples from the C code reconstruction of egame.exe, the largest of the 3 executables that make up this game.

A small thing to break your brain

We’re in the routine computeHudAttitude. A wee bit of assembly doesn’t want to cooperate:

sub ax, ax
sub ax, [0x581c]
mov [0x581c], ax

How hard can it be? Clearly it’s just var = -var. But hold on, cowboy. The assembly my build generates for that is this instead:

mov ax, [0x581c]
neg ax
mov [0x581c], ax 

Stuff like this always smells of signedness, so I tried flipping it with no success. Then the obvious var = 0 - var, and then every possible stupid way to write this, including insane casts and questionable stuff like (var ^ var) - var, -(&var)[0] and a ternary expression (don’t ask). Changing optimization flags didn’t work. Nothing worked.

Then collaborator @xor2003 got this out of his LLM:

var = 0x10000 - var;

…the constant doesn’t even fit in 16 bits… but I guess it’s 0xffff+1, which wraps around to zero once it’s squeezed into a 16-bit register.
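For what it’s worth, here is a minimal sketch of why the trick is at least arithmetically sound. It assumes that 0x10000 is treated as a 32-bit long (as it would be on a compiler with 16-bit ints) and that the result gets truncated back to 16 bits on assignment. The function names are made up for illustration, and none of this explains why MSC picks sub over neg for it.

```c
#include <assert.h>
#include <stdint.h>

/* var = 0x10000 - var, computed in 32 bits and truncated to 16 bits.
 * Assumption: this mirrors what a 16-bit-int compiler does with a long constant. */
uint16_t negate_via_long(int16_t var)
{
    return (uint16_t)((int32_t)0x10000 - (int32_t)var);
}

/* var = -var, truncated to 16 bits the same way. */
uint16_t negate_plain(int16_t var)
{
    return (uint16_t)(-(int32_t)var);
}

/* Walks every possible 16-bit value and checks both forms agree. */
void verify_negation_trick(void)
{
    int32_t v;
    for (v = -32768; v <= 32767; v++) {
        assert(negate_via_long((int16_t)v) == negate_plain((int16_t)v));
    }
}
```

The low 16 bits of 0x10000 are all zero, so once the high word is thrown away, the subtraction is simply 0 - var.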

Any questions? No? That’s fine, I don’t have any either. Just want to crawl under a rock.

Make double sure this var is this var

This happened in a routine that’s called fireAirThreat today, and now looks much better than this, but this is what we were working with originally:

    i = *(int16 *)&stru_3B202[param_1].state[14];

    if (g >> 1 < *(uint16 *)&sams[i].field_8 &&
        (unsigned)(-(word_330B8 * 3 - 0x10)) < g &&
        g < 0x1000 &&
        i != 0) {
        /* launch missile into slot j */
        stru_335C4[j].mapX = stru_3B202[param_1].posX;
        stru_335C4[j].mapY = stru_3B202[param_1].posY;
        stru_335C4[j].alt = stru_3B202[param_1].alt - 0x19;
        stru_335C4[j].field_6 = sams[i].field_A >> 6;

It started simple enough, with the code out of the LLM mostly matching, but a different register was being used around an access to a struct member:

0000:767a/00767a: mov ax, 0x24                     == 0000:4b74/004b74: mov ax, 0x24 ; sizeof(stru_3B202)
0000:767d/00767d: imul [bp+0x04]                   == 0000:4b77/004b77: imul [bp+0x04] ; multiply by index (param_1)
0000:7680/007680: mov si, ax                       == 0000:4b7a/004b7a: mov si, ax ; save for later
0000:7682/007682: mov ax, [si-0x7690]              =~ 0000:4b7c/004b7c: mov ax, [si-0x737e] ; load value of specific member
0000:7686/007686: mov [bp-0x14], ax                == 0000:4b80/004b80: mov [bp-0x14], ax ; store value in stack variable `i`
0000:7689/007689: mov ax, 0x12                     == 0000:4b83/004b83: mov ax, 0x12 ; load sizeof(sams)
0000:768c/00768c: imul [bp-0x14]                   == 0000:4b86/004b86: imul [bp-0x14] ; multiply by `i`
0000:768f/00768f: mov bx, ax                       != 0000:4b89/004b89: mov di, ax ; save for later... whoops!

The part leading up to the mismatch deals with the line i = *(int16 *)&stru_3B202[param_1].state[14], and shows a familiar pattern I’ve seen with MSC many times before when accessing a member in an array of structs: it loads the constant size of the struct (0x24) into a register, multiplies it by the index variable (bp+0x4), then saves the value in register si because sizeof(stru_3B202) * param_1 will be reused in subsequent accesses to this array element. Next we want to do the same for the comparison inside the if, but for sams[i]. Except that when my build likewise saves sizeof(sams) * i into di for reuse, this does not match the reference, which loads the value into bx instead, seemingly oblivious to the possibility of reuse. Could this be unoptimized code? I tried moving the routine to a file that builds without optimizations, but curiously enough, it didn’t make any difference: the function looked basically identical, which is also pretty surprising.
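The size-times-index pattern described above can be sketched in plain C. The struct layout below is entirely hypothetical: I only picked the fields so that the size comes out to 0x24 like in the listing, and the names threat, threats and the two helper functions are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen only so that sizeof is 0x24 as in the listing. */
struct threat {
    int16_t posX;
    int16_t posY;
    int16_t alt;
    uint8_t state[30];
};

static struct threat threats[8];

/* The address computation spelled out by hand:
 * base + index * sizeof(element) + offset of the member. */
const void *state14_by_hand(int index)
{
    size_t scaled = sizeof(struct threat) * (size_t)index; /* mov ax, 0x24 / imul */
    assert(sizeof(struct threat) == 0x24);
    return (const char *)threats + scaled + offsetof(struct threat, state) + 14;
}

/* The same access written the normal way. */
const void *state14_by_compiler(int index)
{
    return &threats[index].state[14];
}
```

In the listing, the array base and the member offset are both constants, so they end up folded into the single displacement on si, and only the scaled index has to live in a register.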

I kept banging my head against the wall and tried everything I could think of, but I could not come up with a way to make the compiler more stupid. Instead, it was making a fool out of me. Again, it was @xor2003 whose LLM came up with this:

    if (sams[i].field_8 > (g >> 1)) {
        if ((unsigned)(-(word_330B8 * 3 - 0x10)) < g) {
            if (g < 0x1000) {
                if (i != 0) {
                    /* launch missile into slot j */
                    i = i;
                    stru_335C4[j].mapX = stru_3B202[param_1].posX;
                    stru_335C4[j].mapY = stru_3B202[param_1].posY;
                    j = j;
                    param_1 = param_1;
                    stru_335C4[j].alt = stru_3B202[param_1].alt - 0x19;
                    stru_335C4[j].field_6 = sams[i].field_A >> 6;

I’m writing this up a while after the fact, so I don’t recall now if breaking up the &&s into nested conditions was absolutely necessary, but I’ve seen elsewhere that opening up a new condition made the compiler lose its memory of the values involved in the conditionals and forced it to recalculate values that it already had in registers, so it’s possible. But the definite missing piece was the insane assignments of variables to themselves. I feel like I wouldn’t have come up with this in a million years, but the LLM somehow pulled it out of a hat.

Now register, now you don’t

This comes from drawProjectionSphere. We have a loop filling an array from other arrays:

    a = 0;
    do {
        f[0] = b[a];
        f[1] = c[a];
        f[2] = d[a];
        f[3] = e[a];
        f[4] = d[a + 1];
        f[5] = e[a + 1];
        f[6] = b[a + 1];
        f[7] = c[a + 1];
        drawPolygonOutline(word_3298A, 4, f, a + 0x60);
    } while (++a < 16);

I needed to get these instructions:

mov word [bp-0x04], 0x0 ; a = 0
mov si, [bp-0x04] ; load `a`
shl si, 1 ; multiply by 2 to get offset in array of int16s
add si, bp ; rebase the offset to the start of the stack frame
mov ax, [si-0x26] ; load the value from the specific stack variable
mov [bp-0x009c], ax ; f[0] = b[a]

But again I kept getting a different register, ax, with the extra complication of this value getting saved onto the stack for no apparent reason:

mov word [bp-0x04], 0x0
mov ax, [bp-0x04] ; `a` loaded into `ax` instead
shl ax, 1
add ax, bp
mov [bp-0x00a6], ax ; register spill of ax onto stack? never actually reused
mov bx, ax ; now it goes into `bx`? okay...
mov ax, [bx-0x26] ; stack variable addressed through bx
mov [bp-0x009c], ax

The fact that the stack is addressed through bx without a segment prefix might be surprising because the default segment register for that is ds, but remember this is the small memory model, so ds=ss, and all is well. In any case, this was happening in optimized code, in a relatively complex routine with repeated almost identical blocks of code, and I’ve seen code deduplication performed by this compiler in the past, so I was having a bad feeling looking at this. I’m not going to paste the entire routine in here, but the crux of the issue was that it smelled of some variables being declared as register (because they went into si and di), except they did not behave this way everywhere, oh no. Only in some places, and elsewhere it was as if the register declaration disappeared. How could a variable have seemingly two different definitions inside one routine? This time it took @AJenbo armed with an LLM to figure it out:

void drawProjectionSphere(int arg_0)
{
    // register int i, j ❌❌❌ removed

    // [...]
    if (*(char *)&word_38FDC < 3) {
        sub_1FEEC(arg_0);
        return;
    }
    {
        register int i; // ✅ placed in nested scope instead
        a = 0;
        do {
            i = a + a;
            *((int *)((char *)&word_3BE9C + i)) = *((int *)((char *)&word_32990 + i));
            a++;
        } while (a < 16);
    }
    word_38FC6 = -var_226;

Turns out the key to solving it was a bunch of lines above the part I was trying to force to match. Removing the register variables from the function scope and introducing nested scopes for them made the si and di registers available outside those scopes, which freed the problematic section to use si. Kind of obvious in hindsight, but I don’t know how long I would have had to brain at this to come up with nested scopes.
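Stripped of the game-specific names, the shape of the fix looks roughly like the sketch below. The function copy_words and its parameters are made up for illustration. Keep in mind that on a modern compiler register is just a hint, so this only shows the scoping, not the si/di allocation behavior we saw with MSC.

```c
#include <assert.h>
#include <stdint.h>

/* Copies `count` 16-bit words using a byte offset, like the loop in the routine.
 * The do/while shape means count must be at least 1. */
void copy_words(int16_t *dst, const int16_t *src, int count)
{
    int a;
    /* No register variables declared at function scope. */

    assert(count > 0);
    {
        register int i; /* exists only inside this block */
        a = 0;
        do {
            i = a + a; /* byte offset of element `a` */
            *(int16_t *)((char *)dst + i) = *(const int16_t *)((const char *)src + i);
            a++;
        } while (a < count);
    }
    /* `i` is out of scope here, so nothing ties up a register for it. */
}
```

A call like copy_words(dst, src, 16) behaves the same wherever the register declaration lives. The only thing that changes is how long the compiler has to honor it.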

And now for something completely different

The routines above were the ones we originally thought were the last missing ones, and solving them felt doubly important: for one, we cleared huge obstacles, and two, we thought we were wrapping up the reconstruction of egame. A little bit later it turned out some C code was still hiding in the executable, so it was back to the drawing board before we could celebrate completion again. But bottom line, we got our precious source code written out in full.

Now, one of the interesting questions was, if we rebuild all of it with maximum optimizations, will it run? Surely it will run faster?

Well, no. And no. It does not run at all. @AJenbo figured out it was due to some optimizations in a routine dealing with timer-related variables that change their values outside of the normal flow of C code. You know, the kind of stuff that you solve with volatile. Except that MS C 5.1 does not support volatile. Or rather, if memory serves, the docs say something to the effect that it is supported “syntactically, but not functionally”. Which I guess is a smart way to say it doesn’t throw an error, but doesn’t really do anything either.
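For readers who haven’t run into this class of bug, here is a minimal sketch of the pattern in modern standard C, not the game’s actual code. The names tick_count, on_tick and wait_for_tick are made up, and a signal raised by hand stands in for the timer interrupt, which is a big simplification: in the real thing the change arrives asynchronously, and that is exactly when an optimizer is tempted to read the variable once and never look again.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by the main flow. volatile tells the
 * compiler it must reload the value on every access. */
static volatile sig_atomic_t tick_count = 0;

static void on_tick(int sig)
{
    (void)sig;
    tick_count++;
}

/* Spins until the handler has advanced tick_count; returns the number of spins. */
int wait_for_tick(void)
{
    void (*previous)(int) = signal(SIGINT, on_tick);
    sig_atomic_t start = tick_count;
    int spins = 0;

    assert(previous != SIG_ERR);
    while (tick_count == start) {
        raise(SIGINT); /* stand-in for the timer interrupt firing */
        spins++;
    }
    signal(SIGINT, previous); /* put the old handler back */
    return spins;
}
```

Without volatile, and with the change coming from a real interrupt rather than a function call inside the loop, a compiler is allowed to keep tick_count in a register and spin forever.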

We’re guessing this could be why some parts of the game were built with the /Zi debug flag: Microprose was building the game in debug mode during development and it worked, but when they tried to build it in turbo mode for release, it broke. With the release deadline looming, they went back to /Zi even though it meant slower code, and left it at that. Guess it’s a lucky thing for us because all that debug code was so much easier to untangle than optimized code.

And with this final example of human fallibility, I leave you tonight, dear reader. Sleep well, but know that the LLMs don’t sleep. 😈