Back in the (e)game
(This post is part of a series on the subject of my hobby project, which is recreating the C source code for the 1989 game F-15 Strike Eagle II by reverse engineering the original binaries.)
This is just a short status update to let anybody interested know that after spending too much time on improvements and bugfixes for my tooling, I’m back to reconstructing the actual game, finally branching into the next executable (EGAME.EXE) since the previous one is mostly working. In fact, I just finished reconstructing the main() routine, so I guess this is cause to celebrate. Yay!
Part of the reason this took so long is that I was procrastinating, because setting up a new executable with my tooling is a bit of a chore. What I basically did was:
- Initial research in and around main() in IDA, named some routines and variables including the overlay driver trampoline routines, exported the listing
- Created a config file (egame_rc.json) for my listing-slicing tools, telling it to tweak segment definitions, export public symbols etc. in the autogenerated files (C header and assembly file holding the not-reconstructed routine stubs as well as the contents of the data segment)
- Created a new Makefile target for building EGAME.EXE and its dependencies, including the autogenerated files
- Wrote the code for main() using the IDA listing as guidance
- Ran the build and automated comparison with mzdiff to the original, tweaked until identical code was emitted
The good news is that now that all that’s done, reconstructing subsequent routines will be much easier, and I’m sure the reconstruction work will see progress, even if it may take a while before it’s complete.
Here is the output from the diffing tool with some statistics at the end:
ninja@RYZEN:f15se2-re$ make verify-egame
mzretools/debug/mzdiff bin/egame.exe:0x10 build/egame.exe:[558bec83ec??c746] --verbose --loose --ctx 30 --map map/egame.map
Comparing code between reference (entrypoint 0000:0010/000010) and target (entrypoint 0000:0010/000010) executables
New comparison location 0000:0010/000010, queue size = 0
--- Now @0000:0010/000010, routine 0000:0010-0000:0146[000137]: main [near], block 000010-000146[000137], target @0000:0010/000010
0000:0010/000010: push bp == 0000:0010/000010: push bp
0000:0011/000011: mov bp, sp == 0000:0011/000011: mov bp, sp
0000:0013/000013: sub sp, 0x6 =~ 0000:0013/000013: sub sp, 0x4
0000:0016/000016: mov word [bp-0x02], 0x0 == 0000:0016/000016: mov word [bp-0x02], 0x0
[...]
Reached end of routine block @ 0000:0146/000146
Completed comparison of routine main, no more reachable blocks
New comparison location 0000:0688/000688, queue size = 13
--- Now @0000:0688/000688, routine 0000:0688-0000:06e0[000059]: routine_14 [near], block 000688-00069a[000013], target @0000:015f/00015f
0000:0688/000688: push bp != 0000:015f/00015f: ret
ERROR: Instruction mismatch in routineroutine_14 at 0000:0688/000688: push bp != 0000:015f/00015f: ret
[...]
--- Routine map stats (static):
Routine map of 400 routines covers 71988/0x11934 bytes (42% of the load module)
Reachable code totals 71773/0x1185d bytes (99% of the mapped area)
Unreachable code totals 1494/0x5d6 bytes (2% of the mapped area)
Excluded 63 routines take 4809/0x12c9 bytes (6% of the mapped area)
Reachable area of excluded routines is 4986/0x137a bytes (6% of the reachable area)
--- Comparison run stats (dynamic):
Seen 2 routines, visited 2 and compared 311/0x137 bytes of opcodes inside (0% of the reachable area)
Ignored (seen but excluded) 0 routines totaling 0/0x0 bytes (0% of the reachable area)
Practical coverage (visited & compared + ignored) is 311/0x137 (0% of the reachable area)
Theoretical(*) coverage (visited & compared + reachable excluded area) is 5297/0x14b1 (7% of the reachable area)
Missed (not seen and not excluded) 335 routines totaling 66779/0x104db bytes (92% of the covered area)
(*) Any routines called only by ignored routines have not been seen and will lower the practical score, but theoretically if we stepped into ignored routines, we would have seen and ignored any that were excluded.
DEBUG: Dumping visited map of size 0x7169 starting at 0x0 to tgt.visited
Building code map from search queue contents: 15 routines over 2 segments
Saving target map to map/egame.tgt
Saving code map (routines = 15) to map/egame.tgt, reversing relocation by 0x0000
Comparison result: mismatch
make: *** [Makefile:256: verify-egame] Error 1
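The three comparison markers in that output (==, =~ and !=) can be illustrated with a small Python sketch. To be clear, this is not how mzdiff is implemented; it only assumes that a "loose" match means two instructions differ in nothing but a hexadecimal literal, as in the sub sp, 0x6 versus sub sp, 0x4 line. The function name classify and the IMM placeholder are made up for the illustration.

```python
import re

# Hypothetical illustration only: mzdiff works on decoded opcodes,
# while this sketch just compares the printed instruction text.
HEX_LITERAL = re.compile(r"0x[0-9a-fA-F]+")


def classify(ref: str, tgt: str) -> str:
    """Return '==' for identical text, '=~' if only hex literals differ, '!=' otherwise."""
    # Collapse runs of whitespace so column alignment does not matter.
    ref_norm = " ".join(ref.split())
    tgt_norm = " ".join(tgt.split())
    if ref_norm == tgt_norm:
        return "=="
    # Mask every hex literal (immediates and offsets alike) and compare again.
    if HEX_LITERAL.sub("IMM", ref_norm) == HEX_LITERAL.sub("IMM", tgt_norm):
        return "=~"
    return "!="


pairs = [
    ("push bp", "push bp"),
    ("sub sp, 0x6", "sub sp, 0x4"),
    ("push bp", "ret"),
]
for ref, tgt in pairs:
    print(f"{ref} {classify(ref, tgt)} {tgt}")
```

Masking every literal is deliberately crude: it would also accept a different stack offset such as [bp-0x02] versus [bp-0x04], which a real comparison tool would presumably want to treat more carefully.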
So, as expected, main() was matched, and the comparison failed on routine_14, which I need to do next. You can see that I have completed 0% of the reconstruction, although if we count ignored routines (libc and such) as completed, the completion stat climbs to 7%, so I have about 92% to go.
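For anyone wanting to check where those percentages come from, here is a short Python sketch that recomputes them from the byte counts printed in the stats. The assumption that the tool truncates to a whole percent (rather than rounding) is mine, inferred from the numbers; the helper pct and the variable names are made up for the illustration.

```python
def pct(part: int, whole: int) -> int:
    """Whole-number percentage, truncated (assumed, not confirmed, to match the tool)."""
    return part * 100 // whole


# Byte counts copied from the stats output above.
mapped = 71988              # area covered by the routine map
reachable = 71773           # reachable code
compared = 311              # visited and compared (main only)
excluded_reachable = 4986   # reachable area of excluded routines (libc and such)
missed = 66779              # not seen and not excluded

practical = pct(compared, reachable)
theoretical = pct(compared + excluded_reachable, reachable)
remaining = pct(missed, mapped)

print(f"practical coverage:   {practical}%")
print(f"theoretical coverage: {theoretical}%")
print(f"missed:               {remaining}%")
```

The sum 311 + 4986 = 5297 bytes is exactly the "theoretical coverage" figure in the output, which is where the 7% comes from.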
Additional good news is that the duplicate search functionality developed in the tooling was not a total waste, even if the results were not earth-shattering. Upon repeating the signatures search from START.EXE for EGAME.EXE with improved routine boundaries and smaller routines included, it detected 11% of EGAME.EXE as a duplicate of code that I already reconstructed before, although that does include the libc functions, so it’s really just 11-7=4%. Together with the lousy 3% it found as coming from the leaked Fleet Defender codebase, that’s 4+3=7% that I don’t need to do, or at least not from scratch. It’s a rough approximation, but you could say I’m about 7 (duplicates from START.EXE and Fleet Defender) + 7 (libc) = 14% done without really doing much. 😉 I’m pretty sure some routines will turn up as unreachable, same as it was with START.EXE, so that could further limit the extent of the reconstruction. But there’s no way around it, the bulk of the work is still ahead of me. Still, it’s not as daunting as when I was first starting out, because I know much more about how the game works, I have the layout of some common structures and the overlay call jump table down, so it’s “just” a matter of going through all the opcodes and writing the correct C code.
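The back-of-the-envelope estimate above, spelled out as a few lines of Python. It assumes the three categories do not overlap, which is part of why it is only a rough approximation; the variable names are made up for the illustration.

```python
# Percentages of EGAME.EXE quoted in the text above.
duplicate_of_start = 11   # matches code from START.EXE, libc included
libc = 7                  # excluded library routines
fleet_defender = 3        # matches the leaked Fleet Defender code

start_only = duplicate_of_start - libc   # START.EXE duplicates without libc
reusable = start_only + fleet_defender   # code I don't need to write from scratch
head_start = reusable + libc             # rough share that is "done" already

print(f"START.EXE duplicates excluding libc: {start_only}%")
print(f"reusable (START.EXE + Fleet Defender): {reusable}%")
print(f"head start including libc: {head_start}%")
```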
This is it for now, I’ll update when I come across something interesting, or if I hit a significant milestone.