More progress on EGAME
(This post is part of a series on the subject of my hobby project, which is recreating the C source code for the 1989 game F-15 Strike Eagle II by reverse engineering the original binaries.)
I am happy to say the reconstruction for EGAME.EXE
is progressing smoothly. I was planning to do an update after reaching 10% completion, but I work in increments of routines, and finishing the previous one only took me past 9%, so I needed to do one more before celebrating. It was a pretty hefty one (sub_18E50
, at least called that for now), so I’m happy to report we are actually sitting around 14%, with about 10k of opcodes transcribed into C and 52k to go:
--- Routine map stats (static): Load module of executable is 167792/0x28f70 bytes Routine map of 400 routines covers 71986/0x11932 bytes (42% of the load module) Reachable code totals 73063/0x11d67 bytes (101% of the mapped area) Unreachable code totals 202/0xca bytes (0% of the mapped area) Excluded 122 routines take 5104/0x13f0 bytes (7% of the mapped area) Reachable area of excluded routines is 5281/0x14a1 bytes (7% of the reachable area) --- Comparison run stats (dynamic): Seen 82 routines, visited 49 and compared 9662/0x25be bytes of opcodes inside (13% of the reachable area) Ignored (seen but excluded) 33 routines totaling 1122/0x462 bytes (1% of the reachable area) Practical coverage (visited & compared + ignored) is 10784/0x2a20 (14% of the reachable area) Theoretical(*) coverage (visited & compared + reachable excluded area) is 14943/0x3a5f (20% of the reachable area) Missed (not seen and not excluded) 229 routines totaling 52517/0xcd25 bytes (72% of the covered area) (*) Any routines called only by ignored routines have not been seen and will lower the practical score, but theoretically if we stepped into ignored routines, we would have seen and ignored any that were excluded.
Mind that some of that 52k will turn out to be assembly so not target for the reconstruction per se, but will need porting to C anyway. So ultimately, more work, just later.
Seeing as it’s been almost exactly a month since I’ve started actually transcribing the code (it was a lot of preparatory work to get the new executable building in my framework before that), that means that I’m roughly able to do 10k in a month, so theoretically if I went all in, I could probably finish the transcription within 5-6 months. Incidentally, this 10k/month lines up with what Fabien Sanglard based his calculations on when considering whether reversing Strike Commander would be possible, before deciding it would take too many years. I think he might have been a bit pessimistic, because of those binaries’ size, a lot is bound to be data, some might be libc… But I’m still happy my game is smaller. 😁
So, will EGAME
be reconstructed around October? I really doubt it, it’s probably going to take a year or more. I actually plan to take it easy for a while, with summer around the corner and more family functions and activities that inevitably brings. I also want to play a few games that I bought on Steam a while back but didn’t even have time to check out. I’m not likely to stop completely, I’m just managing expectations (my own, mostly 😉) to clarify that I’m not going to be able to maintain this pace. But the future is looking bright, so don’t fret.
With that out of the way, let’s look at some interesting code snippets that have popped up while doing the reconstruction.
[bx+si]
I found this in the routine which loads .3dt
(terrain) files. I don’t understand the format for now, but these functions will go a long way towards figuring it out one day.
mov ax, [bp+var_8]
; ...
mov si, [bp+var_4]
mov cl, 6
shl si, cl
mov bx, ax
shl bx, 1
mov ax, [bp+var_6]
add ax, offset buf1_3dt
mov word_1234[bx+si], ax
mov [bp+var_A], 0
After putting one index into si
and another into bx
, the code reaches into what looks like an array of integers using the [bx+si]
memory addressing mode. But why didn’t it just build the index in one register? I tried various ways of writing the indexing expression before realizing this is actually a 2-dimensional array. The first shift by 6 is multiplying the column value in var_4
by 64, which is the size of the row (32) times the size of the element (uint16: 2), then another shift on the row index multiplies it by 2 to get the final offset. It’s actually lucky that it uses a different indexing mode for matrices, else I might have had difficulties recognizing it as such. Now I can rename and declare word_1234
as something more meaningful:
extern int matrix3dt_2[5][32];
void load3DT(char *arg_0) {
// ...
matrix3dt_2[var_4][var_8] = (int16)(buf1_3dt + var_6);
// ...
}
sub-sbb-and-add-and-…what?
This time we is in the .3d3
file loading routine, which I suspect are 3d models. Either way:
mov ax, size3d3_2
sub ax, 800h
sbb cx, cx
and ax, cx
add ah, 8
mov [bp+var_A], ax
We’ve seen sub-sbb-neg
used by the compiler before as a way to perform branchless NULL
checks, so that’s a hint it might be trying something similar. I’m using my old trusted technique of plotting the values for the significant cases. Here, it’s clearly trying to compare something against 0x800
, so let’s pick one value below, and one above it:
Instruction | Value (ax=0x1234) | Value (ax=0x200) |
---|---|---|
mov ax, size3d3_2 | ax=0x1234 | ax=0x200 |
sbb cx, cx | cx=0 | cx=0xffff |
and ax, cx | ax=0 | ax=0xfa00 |
add ah, 8 | ax=0x800 | ax=0x200 |
Seeing the values makes it crystal clear that it’s just clamping the value to the range 0-0x800
. This simple code matches the binary arithmetic mess exactly:
var_A = (size3d3_2 >= 0x800) ? 0x800 : size3d3_2;
I really need a clever heading for this
Still in the .3d3
routine:
mov ax, [bp+var_A] ; part of an earlier calculation
add [bp+var_E], ax
cmp [bp+var_10], 0
jnz short loc_12C56
mov si, size3d3
shl si, 1
add ax, buf3d3[si] ; new calculation reuses value of var_A already in ax
mov (buf3d3+2)[si], ax
The code is pretty simple:
var_E += var_A;
if (var_10 == 0) {
// 2c52
buf3d3[size3d3+1] = buf3d3[size3d3] + var_A;
} // 2c56
…but the question is how to get the value of stack variable var_A
in register ax
from before the conditional jump to propagate to the addition operation within the conditional code. I tried in vain, but the compiler would reload the value of buf3d3[si]
into ax, and add var_A
to it.
I really don’t remember too well how I got the idea, but when all else fails, try flipping signedness. Changing the declaration of buf3d3
into extern unsigned int buf3d3[]
solves this one, but don’t ask me why.
Haven’t had enough binary arithmetic magic yet?
This routine ostensibly processes the cases for view switching in the plane:
mov ax, word_330C4
inc ax
cwd
sub ax, dx
sar ax, 1
mov cx, word_336E8
sub cx, ax
dec cx
and cx, 0Fh
mov [bp+var_E], cx
After seeing this, I got a bit nauseous, so decided it was time I had a stimulating conversation with my pal ChatGPT. I mean, obviously we’re trying to do some operation on a doubleword, but why subtract the older half from the younger? Anyway, they told me that:
sub ax, dx This is the interesting part. Since dx is either 0x0000 (for positive ax) or 0xFFFF (for negative ax), the subtraction effectively does: If ax was non-negative (dx = 0x0000), then ax remains unchanged. If ax was negative (dx = 0xFFFF), then ax = ax - (-1) = ax + 1. This effectively cancels the inc ax instruction for negative values. This adjustment ensures that the rounding behavior for division by 2 is more symmetric. Normally, integer division truncates toward zero, but this modification makes negative numbers round more correctly toward the mathematical floor. Without sub ax, dx, a negative odd value would round incorrectly due to simple truncation.
Are they right? Who knows, right? It sounds so smart that I’m inclined to believe it. 😉 In any case, my buddy kindly made a table of values just like I enjoy, and it looks like it really is just a way to get division with sar
to line up – I should really know better by now that ax:dx
does not always equal doubleword, sometimes it’s just plain division. There’s really nothing interesting to the matching code:
var_E = (word_336E8 - ((word_330C4 + 1) / 2) - 1) & 0xf;
Divsion came knocking again
I don’t even know what this routine does (yet). Within, we have this function call with a conditional in the middle of the arguments getting pushed onto the stack.
mov ax, 0Fh
push ax
mov ax, 36h ; '6'
push ax
mov ax, 0E4h ; '�'
push ax
cmp word_380D0, 64h ; 'd'
jnb short loc_192D5
mov ax, word_380D0
jmp short loc_192E7
loc_192D5:
mov ax, word_380D0
sub dx, dx ; extend into dx
mov cx, 5 ; ...divide by 5
div cx
mov cx, ax
shl ax, 1
shl ax, 1 ; ...multiply by 4
add ax, cx ; ...add one more 🎵
loc_192E7:
push ax
call sub_1A183
The troublesome bit is the add ax, cx
. For some reason, I kept writing it as:
sub_1A183(word_380D0 < 0x64 ? word_380D0 : (word_380D0 / 5) * 4 + (word_380D0 / 5), 0xe4, 0x36, 0xf);
This does not match, the division is repeated, before adding the value. It took a while to click: multiply it by 4, add one more time makes 5!
sub_1A183(word_380D0 < 0x64 ? word_380D0 : (word_380D0 / 5) * 5, 0xe4, 0x36, 0xf);
This is as match. Looks dumb mathematically which is why I initially rejected it, but quickly remembered this is binary division, so this will strip the remainder off the value, performing rounding to a multiple of 5, essentially the equivalent of word_380D0 - (word_380D0 % 5)
.
Relax, that was the last one
I wanted to conclude this entry with some general discoveries. Mind that I’m not only doing the reconstruction, but investigation in IDA to properly mark variables, add declarations for automatic C header generation etc. Usually, I try not to follow the rabbit holes too deep and focus on the routine that I’m currently looking at, but sometimes a piece of information is missing, and I need to search elsewhere. At some point, I was trying to figure out the layout of some struct data, but the code in the current routine was only accessing two members out of 8, so I had to cast a wider net. This way, I finally found myself in this interesting code:
; ...
loc_1D747:
cmp ax, 266Ch
jnz short loc_1D74F
jmp keyL_1D31D
loc_1D74F:
jbe short loc_1D754
jmp loc_1D7EE
loc_1D754:
cmp ax, 1970h
jnz short loc_1D75C
jmp keyP_1D605
loc_1D75C:
ja short loc_1D79E
cmp ax, 1177h
jnz short loc_1D766
jmp keyW_1D5AF
loc_1D766:
ja short loc_1D77B
cmp ax, 11Bh
jnz short loc_1D770
jmp keyEsc_1D6B6
loc_1D770:
cmp ax, 0E08h
jnz short loc_1D778
jmp keyBkspc_1D641
; ...
This is a long series of checks of ax against seemingly arbitrary values which was orignally probably a switch
statement, but I realized these were key scan codes. This is the routine for dispatching keypresses! This is immensely helpful, because I know what the keys do in the game, so I can infer (even if broadly) the purpose of the routines that are invoked by the cases. I did some initial poking around and found a bunch of interesting avenues for further research. As it is with IDA, figuring out one bit in one place will sometimes unlock whole other areas for you, and you keep on doing that until you’re done.
Fat routines
I wanted to talk about are the sizes of the routines I’ve been encountering. Some are pretty significant. Last one I finished was almost 1600 bytes of code, which is not dramatic, but not trivial either, especially as it has multiple nested conditions and loops inside, which reminds me about how I used to need to do desperate stuff to get the control flow down. The current routine that I am supposed to do is over 4700 bytes! It’s one of the key dispatch routines I was excited about (there are actually two separate ones, not sure why yet). Anyway, this hints at maybe why why could not find many routines from Fleet Defender in the F15SE2 code - a lot of this code appears to be huge, sprawling routines that cover many aspects of the game logic. Perhaps at some point there was a refactor of this codebase into more manageable, smaller bits, which would not be found with routine signatures, and would be little help in reconstructing the code even if we could find them. Perhaps the old, ugly code was decided to be too F15-specific, so it was thrown away at some point in time of the codebase’s lifetime as cruft, and did not carry over to either F15SE3, or some other step along the way. I’ll probably never know, but this kind of makes sense and closes down the quest for duplicate searching against the F-14 code leak.
Transcription 101
By the way, how am I managing to write all this convoluted code without getting confused? Recently, while doing the reconstruction, I started following a sort-of formalized approach to transcribing the code. It’s a pretty small detail, but enough of a gamechanger for me that I wanted to mention it. Basically, while working on a routine, I will write out hex offsets of the corresponding assembly opcodes as comments in the source code. Not for every line, but it’s very helpful if I ever need to go back from an offset to a place in the C code, which pops up pretty often. Anyway, this isn’t new, I’ve been doing it since the beginning. What I started doing was writing an offset comment on every opening and closing brace, and closing the braces immediately when I encounter them:
int sub_18E50(int arg_0) {
int var_2, var_4, var_6, var_8, var_A, var_C, var_E, var_10, var_12, var_14, var_16, var_18, var_1A;
char var_1C;
byte_3C5A0 = gfx_jump_2d();
var_16 = waypoints[waypointIndex].field_0 - word_3BEC0;
var_1A = waypoints[waypointIndex].field_2 - word_3BED0;
// 8e83
word_3BE92 = sub_1D008(var_16, -var_1A);
if (word_330C2 != 0) { // 8e96
if (word_38FEA != 0) { // 8e9d
word_38FEA = 0;
if (!(keyValue & 0x80)) { // 8eaa
sub_19E44(0xd);
sub_19E5D(0, 0, 0x13f, 0x60);
gfx_jump_4f(0x3c);
} // 🔵 nothing here because it's the same as below, 0x8ed2
} // 8ed2
byte_37C2F = 1;
if (keyValue == 0 && byte_37C24 == 0) { // 8eeb
if (!commData->setupUseJoy) { // 8ef9
sub_19E44(0);
// 🟢🟢🟢 working here
} // 8fce
} // 93c4 // 🔵 have all these marked out in advance
} // 93cf // 🔵
} // 9485 // 🔵
Having these braces fixed at both ends as soon as I encounter the opening one lets me keep my bearings even inside pretty complex control structures. When I see a location in the IDA listing, I can immediately check if it matches some of my currently open blocks’ ending, and hence that I should move out of its scope. Likewise, I have a convention for IDA, where I rename its loc_01234
labels to if/else/endif/loop/..._01234
to make the listing more readable, but that’s not as important in my view, and I only do it where it’s especially difficult to figure stuff out.
This of course assumes unoptimized code without any nastiness like deduplication or code block reordering. But luckily enough, most of the code I’ve seen so far has been compiled with the elusive /Zi flag, so the opcodes pretty much follow the C code one-to-one. But I’m sure something will surprise me one day. Oh well, even in such case following this pattern for as long as possible lets me know something is afoot when the jump sequence does not make sense.
Thanks for reading, I’ll update around the 20-30% mark or if there is anything of interest.