(…continued from Part 1)

Hardware

DOS runs on, and is tied to, the Intel x86 CPU platform. The lowest common denominator for that platform is the 8086 CPU, a 16-bit processor with a limited instruction set, variable-length instruction encoding, limited data bus bandwidth, pitiful performance, and a peculiar memory addressing scheme that combines two 16-bit registers to form a 20-bit address. Understanding the 8086 CPU (its registers and their capabilities, memory models, segmented addressing and, above all, its assembly language) is the single most important skill for reversing an old game like F15 SE2.
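
To make the segmented addressing concrete, here is a minimal Python sketch of how a segment:offset pair turns into a physical address. The function name linear_address and its wrap_20bit parameter are made up for illustration. The arithmetic itself (segment shifted left by 4 bits, plus the offset) is how real mode forms addresses, and the wrap-around reflects the 8086 having only 20 address lines.

```python
def linear_address(segment: int, offset: int, wrap_20bit: bool = True) -> int:
    """Translate a real-mode segment:offset pair into a physical address.

    wrap_20bit is a hypothetical switch for this sketch: True mimics the
    8086, where the sum wraps around at 1 MB; False mimics later CPUs that
    have more than 20 address lines enabled.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    # The segment counts 16-byte "paragraphs", hence the shift by 4 bits.
    address = (segment << 4) + offset
    if wrap_20bit:
        address &= 0xFFFFF
    return address


# 1234:5678 -> 0x12340 + 0x5678
print(hex(linear_address(0x1234, 0x5678)))  # 0x179b8

# Many different pairs alias the same byte, e.g. 179B:0008.
print(linear_address(0x1234, 0x5678) == linear_address(0x179B, 0x0008))  # True
```

The aliasing shown in the last line is worth keeping in mind when comparing addresses seen in a debugger against addresses in a disassembly: two pairs that look different can point at the same byte.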

Later DOS games embraced the features of more advanced Intel CPUs, and many ran in protected mode under an extender like DOS/4G, which let them access much more memory, use fast 32-bit instructions for superior performance, and basically ignore DOS even more than what was considered polite. ;) These games are actually kind of like miniature operating systems, and I’m lucky F15 SE2 just uses the lowest common denominator 16-bit 8086 instructions, because I know next to nothing about low-level protected mode programming.

As a consequence of the 8086’s capabilities, memory in real mode is severely limited. You get the famous 640KB because the CPU can address a little over 1MB of memory, a sizable chunk of which is taken up by regions memory-mapped to hardware, while other areas are claimed by the BIOS and DOS itself. In real-world usage, you would typically have some 500KB, give or take, available after loading DOS and some drivers.
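
The 640KB figure falls out of simple arithmetic on the address space. The sketch below uses the commonly documented IBM PC layout. The exact use of the area above C0000 varied from machine to machine, so treat the region boundaries and descriptions as a typical example rather than a universal map. The names MEMORY_MAP and region_size_kib are made up for illustration.

```python
KIB = 1024

# Typical real-mode memory map: (first address, last address, description).
# Assumption: the standard IBM PC layout; individual machines differed in
# how the area above 0xC0000 was used.
MEMORY_MAP = [
    (0x00000, 0x9FFFF, "conventional memory (interrupt table, DOS, drivers, programs)"),
    (0xA0000, 0xBFFFF, "video memory"),
    (0xC0000, 0xEFFFF, "adapter ROMs and other mapped hardware"),
    (0xF0000, 0xFFFFF, "system BIOS ROM"),
]


def region_size_kib(start: int, end: int) -> int:
    """Size of an inclusive address range in kilobytes."""
    return (end - start + 1) // KIB


for start, end, what in MEMORY_MAP:
    print(f"{start:05X}-{end:05X}  {region_size_kib(start, end):4d} KB  {what}")

total = sum(region_size_kib(start, end) for start, end, _ in MEMORY_MAP)
print(f"total: {total} KB")  # total: 1024 KB
```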

The sound and video hardware usually came as separate cards plugged into the mainboard, and you would interact with them through a combination of memory-mapped and port-mapped I/O (the CPU has dedicated instructions, IN and OUT, for the latter). All of this was vendor-specific, and good knowledge of particular hardware is necessary to… you get the idea. I know next to nothing about sound hardware, so I am choosing to ignore sound. As for video, I get nervous sweats thinking about the brain-dead way the EGA video boards organize their memory into bit-planes that you need to write to in order to get stuff on screen. Luckily, F15 SE2 supports the MCGA graphics adapter (which was more or less equivalent to a VGA card), whose 320x200 Mode 13h is relatively painless to work with and is the mode the game looks best in anyway, so that’s what I’ll be focusing on.
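
What makes Mode 13h painless is that the screen is a plain linear array: one byte per pixel, 320 bytes per row, starting at segment A000. The Python sketch below only simulates this with a bytearray and does not touch any real or emulated hardware. The names pixel_offset, put_pixel and framebuffer are made up for illustration.

```python
WIDTH, HEIGHT = 320, 200
VIDEO_SEGMENT = 0xA000  # Mode 13h video memory starts at A000:0000

# Stand-in for the 64000 bytes of video memory; each byte is a palette index.
framebuffer = bytearray(WIDTH * HEIGHT)


def pixel_offset(x: int, y: int) -> int:
    """Offset of pixel (x, y) from the start of the video segment."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        raise ValueError("pixel outside the 320x200 screen")
    return y * WIDTH + x


def put_pixel(x: int, y: int, color: int) -> None:
    """Write one palette index (0-255) into the simulated framebuffer."""
    framebuffer[pixel_offset(x, y)] = color & 0xFF


put_pixel(10, 5, 15)
print(pixel_offset(10, 5))      # 1610
print(pixel_offset(319, 199))   # 63999, the last byte of the screen
```

The whole screen fits in 64000 bytes, so it can be reached through a single 64KB segment, which is a large part of why this mode is so convenient compared to the planar EGA modes.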

Tools

A benefit of doing this kind of project in Modern Times(tm) is that we have access to all kinds of spiffy programming tools, and multitasking graphical environments that were not available back in the 80s. Heck, I can even run more than one program at once! That being said, the most important tools are also kind of crusty:

DosBox: I have a love-hate relationship with it. On the one hand, I need some way of running the game, both in the original and in the reconstructed version, and DosBox emulates an x86 PC running DOS brilliantly for the purpose of running games. On the other, its development has been stagnant around version 0.74 since forever, it’s based on an ancient version of the SDL library which makes it problematic to run on modern hardware (toggle the display to fullscreen and try to guess which monitor it will jump to), it has many annoying bugs, especially around input and input mapping, configuring it is a nightmare (I dare you to try to get your game running in the resolution and aspect ratio you want), and the community and especially the developers are a bunch of… not very nice people. ;) Luckily, a brave and beautiful soul has forked it to adapt it to modern systems, but because the fork is a RetroArch core and not a standalone program, it has limited potential as a reverse-engineering tool (great for actually playing the games though). So, I’m sticking with DosBox, and particularly the debug version of it, which provides an extra window where you can step through the game code, view the CPU registers and memory contents, place breakpoints etc. - pretty much everything that’s needed, so I’m almost not mad at how much it sucks anymore. Almost.

IDA: Interactive Disassembler has long been the go-to tool for software forensics, and many successful reverse engineering projects have been completed with its aid. It has more features and supports more platforms than I can count, but for working with DOS binaries, the ancient 5.0 version is supposed to be optimal (I think the vendor dropped DOS support in later versions). It is also nice that a freeware download of that version is available, hosted by the ScummVM project, since a commercial license of IDA costs something like 5 Million $$$ (or so I’m told). The interface is super crusty and it looks as daunting as the cockpit of a space shuttle, but it sure beats working from raw disassembly text once you get the hang of it. IDA is an “interactive” disassembler, which means you load your binary into it, then it analyzes it and spits out disassembly annotated with details that it was able to discern from the code: function boundaries are discovered if possible, stack frame layout might be discovered, code and data segments are separated and (auto-generated) labels are placed on referenced locations. You then go through the disassembly and, based on what you learn, assign meaningful names to functions and data, reorder stuff around (with IDA making sure everything stays consistent) etc. As you do that, your changes propagate throughout the disassembly: if you discover what a variable referenced in one function is for and rename it to reflect that, it will show up under that name in all other functions. You keep filling in the blanks to form the full picture, kind of like solving a crossword puzzle, where letters from some words help you figure out the others.

compiler: As explained in a different post, F15 SE2 is written in Microsoft C 5.0/5.1, so it’s beneficial to use the same compiler for the recreation (although not strictly necessary).

misc. others: an unpacker for compressed EXEs, a hex editor, a calculator or a Python prompt for calculating offsets and such, a good code editor and other such nonsense.

mzretools: for reasons explained later, I decided to write my own tool for analyzing DOS executables. It is still a work in progress and doesn’t do much, but I’m hoping to get it to a place where it can be useful in reversing games other than the one I’m working on.
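
To give a flavour of what analyzing a DOS executable starts with, here is a minimal Python sketch that reads the standard 28-byte MZ header found at the start of a DOS EXE file. This is not how mzretools is implemented. The function names, field names and the synthetic header below are all made up for illustration; only the header layout itself is the standard format.

```python
import struct

# The MZ header: a 2-byte signature followed by 13 little-endian 16-bit words.
MZ_HEADER_FORMAT = "<2s13H"
MZ_HEADER_SIZE = struct.calcsize(MZ_HEADER_FORMAT)  # 28 bytes

# Hypothetical, descriptive field names chosen for this sketch.
FIELDS = (
    "signature", "last_page_bytes", "pages", "reloc_count",
    "header_paragraphs", "min_alloc", "max_alloc", "ss", "sp",
    "checksum", "ip", "cs", "reloc_table_offset", "overlay_number",
)


def parse_mz_header(data: bytes) -> dict:
    """Unpack the fixed part of an MZ header into a dictionary."""
    if len(data) < MZ_HEADER_SIZE:
        raise ValueError("not enough data for an MZ header")
    header = dict(zip(FIELDS, struct.unpack_from(MZ_HEADER_FORMAT, data)))
    if header["signature"] not in (b"MZ", b"ZM"):
        raise ValueError("missing MZ signature")
    return header


def load_module_size(header: dict) -> int:
    """Bytes of code and data that follow the header in the file."""
    file_bytes = header["pages"] * 512
    if header["last_page_bytes"]:
        # The last 512-byte page is only partially used.
        file_bytes -= 512 - header["last_page_bytes"]
    return file_bytes - header["header_paragraphs"] * 16


# Synthetic header for demonstration only, not taken from any real game.
sample = struct.pack(MZ_HEADER_FORMAT, b"MZ", 0x50, 3, 0, 2,
                     0, 0xFFFF, 0x100, 0x80, 0, 0, 0, 0x1C, 0)
header = parse_mz_header(sample)
print(load_module_size(header))  # 1072
```

The cs:ip and ss:sp values in the header are relative to the segment where DOS loads the program, which is why offset arithmetic comes up so often when matching addresses in a debugger against positions in the file.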

(continues in Part 3)