Editorial note: This is the first of a two-part tutorial on reverse engineering executables. Today, we’ll walk through the process of finding the section of an executable responsible for performing a certain task, and next week we’ll analyze the assembly instructions which comprise that section.
The Project
The game “Neuromancer”, from 1989, uses a “Pax Verification Code Wheel” as a form of copy protection. I wanted to dig into the codewheel’s implementation because I was curious whether the codes were generated algorithmically, or pulled from a look-up table. The aim of this project, therefore, is to find and analyze the portion of Neuromancer’s machine code that is concerned with codewheel checks.
Following Along
If you’d like to play along at home, get yourself a copy of the PC version of “Neuromancer”. The MD5 hash should be:
0x4b1d2c88f96f298c568c6f8694d727b7
(Note that the MD5 will be different if you’re using a cracked copy of the game.) The memory addresses I cite in this article will likely be different from those you see on your machine, but file offsets, segment offsets, and relative memory locations should be the same.
Beginnings
Let’s look at one of the screens related to the codewheel:
This screen has 5 lines of text displayed on it:
- “PAX – Public Access System”
- “Microsofts”
- “Ratz”
- “Holy Joystick”
- “Enter verification code”
If we check the neuro.exe executable with a hex editor, we find that the first and last strings are in the executable, at file offsets 0x15e1b and 0x15e37. (The string “Ratz” is also found in the executable, but at a distant offset, and in a context that appears unrelated.) These strings will enable us to zero in on the instructions related to the codewheel.
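The hex-editor search described above can also be scripted. The following is a minimal sketch, not the method used in this article: the helper names `find_all` and `report` are made up for illustration, and it assumes the strings are stored as plain ASCII with an ordinary hyphen in “PAX - Public”.

```python
def find_all(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data (overlaps included)."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def report(path: str, needles: list) -> None:
    """Print the file offsets of each needle found in the file at path."""
    with open(path, "rb") as f:
        data = f.read()
    for needle in needles:
        hits = ", ".join(hex(o) for o in find_all(data, needle))
        print(f"{needle!r}: {hits or 'not found'}")


# Hypothetical usage, assuming neuro.exe is in the current directory:
# report("neuro.exe", [b"PAX - Public", b"Enter veri", b"Ratz"])
```

Searching for a short prefix of each string, rather than the full text, avoids missing a match because of an unexpected punctuation or padding byte later in the string.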
Segments and Offsets
Neuromancer is a 16-bit program, so it uses a segmented memory model. In such a model, addressable memory is the 1 MB address space between 0x00000 and 0xFFFFF, and memory is addressed with segment:offset pairs. In these pairs the segment and offset are each 16-bit numbers, and the final physical address is equal to segment*16 + offset. In the vast majority of instructions, the segment is implicitly stored in the DS or CS registers (depending on the instruction’s opcode) and only the offset is specified by the instruction itself.
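As a quick illustration of the segment*16 + offset rule, here is a small sketch. The function name `physical_address` is made up for this example, and masking the result to 20 bits is an assumption that models 8086-style wraparound at the top of the 1 MB address space.

```python
def physical_address(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Assumption: wrap at 1 MB, as on a CPU with only 20 address lines.
    return (segment * 16 + offset) & 0xFFFFF


# Two different pairs can name the same byte of memory.
a = physical_address(0x1234, 0x0010)
b = physical_address(0x1235, 0x0000)
print(f"{a:#07x} {b:#07x} same={a == b}")
```

Because the segment is multiplied by only 16, raising the segment by 1 and lowering the offset by 0x10 always lands on the same physical byte. This is why a single location can show up under more than one segment:offset pair in a debugger.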
Ultimately, we will want to search for instructions which reference the codewheel-related strings. Such a search depends upon knowing the offsets used to refer to those strings, and those offsets in turn depend upon the implicit segment used by the instructions containing the offsets, and the strings’ position in memory relative to that segment.
It is reasonable to guess that the instructions that reference the codewheel-related strings will use the DS register for segment information. It is also reasonable to guess that the program will set up a primary data segment early in its execution, store the address of that segment in DS, and rarely change it thereafter. Therefore, if we can break program execution after that initial setup is complete, we can search the DS segment for the text strings, and find their offsets.
Video interrupts are pretty rare, and likely to be executed only after segment setup has been completed. Therefore, INT 10h (opcode cd 10) instructions are good places to set breakpoints at which to search for our strings of interest.
If we fire up the DOS DEBUG command by entering “debug neuro.exe” at the command line, we can search for INT 10h instructions by entering “s ds:0 l 0 cd 10” at the debugger prompt. This will search the first 64K of the executable for the byte sequence cd 10, since DS is initialized such that DS:0000 falls near the beginning of the executable image.
This command turns up two results, at 13CB:DD68 and 13CB:DDDC. If we enter “g 13cb:dd68 13cb:dddc” at the debugger prompt, we will find that we don’t hit either breakpoint until we exit the game. (That’s ok, though: just play and exit!) When we do hit the breakpoints, we can enter the following commands to search the DS segment for the strings we’re interested in:
s ds:0 l 0 "PAX - Public"
s ds:0 l 0 "Enter veri"
Each search returns one result.
s ds:0 l 0 "PAX - Public" returns 59b7:5458
s ds:0 l 0 "Enter veri" returns 59b7:5474
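The relative addresses of the strings look ok: they are 0x1c bytes apart in memory (0x5474 - 0x5458), exactly as they are in the file (0x15e37 - 0x15e1b). From these results, we can conclude that we should look for instructions referencing offsets 0x5458 and 0x5474.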
Code
Let’s restart the DOS DEBUG command by entering “debug neuro.exe” at the command line, and search for references to these offsets. Any such references will be encoded in instructions as little-endian 16-bit words (least significant byte first), so enter these commands at the debug prompt (again, this will search the first 64K of the executable for the byte sequences in question):
s ds:0 l 0 58 54
s ds:0 l 0 74 54
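The byte order used in the two searches above can be double-checked with a short sketch. The helper name `imm16_bytes` is made up for illustration; `struct.pack` with the `<H` format produces an unsigned 16-bit value in little-endian order.

```python
import struct


def imm16_bytes(offset: int) -> bytes:
    """Encode a 16-bit offset the way it appears inside an instruction."""
    return struct.pack("<H", offset)  # "<H" = little-endian unsigned 16-bit


for off in (0x5458, 0x5474):
    # Print the offset next to the byte sequence to search for.
    print(f"{off:#06x} -> {imm16_bytes(off).hex(' ')}")
```

A two-byte pattern is short enough to produce coincidental matches, so each hit still needs to be inspected in context.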
The results are promising:
s ds:0 l 0 58 54 returns 13cb:5ac1
s ds:0 l 0 74 54 returns 13cb:42ec and 13cb:5b72
One reference to each string falls close together, at 13cb:5ac1 and 13cb:5b72. There’s a good chance the code near there is related to the codewheel.
The machine code for functions is often demarcated by the byte sequence c3 55 8b ec, representing the instructions RET, PUSH BP, and MOV BP, SP. (The first instruction comes from the end of one function, and the next two from the beginning of another.) Let’s look for these “fence bytes” preceding our suspected codewheel code. Entering:
s 13cb:5700 l 3c1 c3 55 8b ec
(i.e. search for the byte sequence c3 55 8b ec in the 0x3c1 bytes preceding the first reference to one of the PAX strings) at the debugger prompt returns about a dozen addresses, of which the latest is 13cb:5a91. The RET byte sits at that address, so the function that follows it starts one byte later, at 13cb:5a92.
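The same “last fence before the reference” search can be sketched outside the debugger. This is an illustration under stated assumptions: the name `last_function_start` is made up, and `image` stands for a byte buffer whose indices correspond to the offsets being searched (for example, a memory dump of the segment).

```python
FENCE = bytes.fromhex("c3558bec")  # RET, PUSH BP, MOV BP, SP


def last_function_start(image: bytes, ref_offset: int, window: int):
    """Return the offset of the PUSH BP after the last fence found in the
    `window` bytes preceding ref_offset, or None if there is no fence."""
    lo = max(0, ref_offset - window)
    hit = image.rfind(FENCE, lo, ref_offset)
    # The fence begins with the previous function's RET, so skip one byte.
    return None if hit == -1 else hit + 1


# The window in the DEBUG command is the distance from the search start
# to the first string reference.
print(hex(0x5AC1 - 0x5700))

# Synthetic demonstration: two fences embedded in zero padding.
demo = bytes(10) + FENCE + bytes(5) + FENCE + bytes(20)
print(last_function_start(demo, 40, 40))
```

Taking only the last match is a heuristic. A c3 55 8b ec sequence could also occur inside data or in the middle of a longer instruction, so the result is worth confirming with a breakpoint.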
If we put a breakpoint at 13cb:5a92 (by entering “g 13cb:5a92” at the debugger prompt), we find that this code is, in fact, invoked when we attempt to access the PAX terminal in the game. When we hit the breakpoint, we find that the current CS:IP pointer is actually 13db:5992, which refers to the same physical address as 13cb:5a92 (0x13db0 + 0x5992 = 0x13cb0 + 0x5a92 = 0x19742). If we then (repeatedly) use the “u” command to unassemble the machine code at and following this address, we find the instructions reproduced below. We will analyze them next week.
13DB:5992 55 PUSH BP
13DB:5993 8BEC MOV BP, SP
13DB:5995 83EC0C SUB SP, +0C
13DB:5998 B87C6A MOV AX, 6A7C
13DB:599B 1E PUSH DS
13DB:599C 50 PUSH AX
13DB:599D B84A54 MOV AX, 544A
13DB:59A0 50 PUSH AX
13DB:59A1 E8ACCB CALL 2550
13DB:59A4 83C406 ADD SP, +06
13DB:59A7 B80100 MOV AX, 0001
13DB:59AA 50 PUSH AX
13DB:59AB E8D3F7 CALL 5181
13DB:59AE 83C402 ADD SP, +02
13DB:59B1 B80800 MOV AX, 0008
13DB:59B4 50 PUSH AX
13DB:59B5 50 PUSH AX
13DB:59B6 E8F6F1 CALL 4BAF
13DB:59B9 83C404 ADD SP, +04
13DB:59BC B80200 MOV AX, 0002
13DB:59BF 50 PUSH AX
13DB:59C0 B85854 MOV AX, 5458
13DB:59C3 50 PUSH AX
13DB:59C4 E813F2 CALL 4BDA
13DB:59C7 83C404 ADD SP, +04
13DB:59CA E801A1 CALL FACE
13DB:59CD 250F00 AND AX, 000F
13DB:59D0 8946FC MOV [BP-04], AX
13DB:59D3 E8F8A0 CALL FACE
13DB:59D6 250F00 AND AX, 000F
13DB:59D9 8946FA MOV [BP-06], AX
13DB:59DC E8EFA0 CALL FACE
13DB:59DF 250F00 AND AX, 000F
13DB:59E2 8946F8 MOV [BP-08], AX
13DB:59E5 B81800 MOV AX, 0018
13DB:59E8 50 PUSH AX
13DB:59E9 B86000 MOV AX, 0060
13DB:59EC 50 PUSH AX
13DB:59ED E8BFF1 CALL 4BAF
13DB:59F0 83C404 ADD SP, +04
13DB:59F3 B80200 MOV AX, 0002
13DB:59F6 50 PUSH AX
13DB:59F7 FF76FC PUSH [BP-04]
13DB:59FA B87C6A MOV AX, 6A7C
13DB:59FD 50 PUSH AX
13DB:59FE E8BC21 CALL 7BBD
13DB:5A01 83C404 ADD SP, +04
13DB:5A04 50 PUSH AX
13DB:5A05 E8D2F1 CALL 4BDA
13DB:5A08 83C404 ADD SP, +04
13DB:5A0B B82000 MOV AX, 0020
13DB:5A0E 50 PUSH AX
13DB:5A0F B86000 MOV AX, 0060
13DB:5A12 50 PUSH AX
13DB:5A13 E899F1 CALL 4BAF
13DB:5A16 83C404 ADD SP, +04
13DB:5A19 B80200 MOV AX, 0002
13DB:5A1C 50 PUSH AX
13DB:5A1D 8B46FA MOV AX, [BP-06]
13DB:5A20 051000 ADD AX, 0010
13DB:5A23 50 PUSH AX
13DB:5A24 B87C6A MOV AX, 6A7C
13DB:5A27 50 PUSH AX
13DB:5A28 E89221 CALL 7BBD
13DB:5A2B 83C404 ADD SP, +04
13DB:5A2E 50 PUSH AX
13DB:5A2F E8A8F1 CALL 4BDA
13DB:5A32 83C404 ADD SP, +04
13DB:5A35 B82800 MOV AX, 0028
13DB:5A38 50 PUSH AX
13DB:5A39 B86000 MOV AX, 0060
13DB:5A3C 50 PUSH AX
13DB:5A3D E86FF1 CALL 4BAF
13DB:5A40 83C404 ADD SP, +04
13DB:5A43 B80200 MOV AX, 0002
13DB:5A46 50 PUSH AX
13DB:5A47 8B46F8 MOV AX, [BP-08]
13DB:5A4A 052000 ADD AX, 0020
13DB:5A4D 50 PUSH AX
13DB:5A4E B87C6A MOV AX, 6A7C
13DB:5A51 50 PUSH AX
13DB:5A52 E86821 CALL 7BBD
13DB:5A55 83C404 ADD SP, +04
13DB:5A58 50 PUSH AX
13DB:5A59 E87EF1 CALL 4BDA
13DB:5A5C 83C404 ADD SP, +04
13DB:5A5F B83800 MOV AX, 0038
13DB:5A62 50 PUSH AX
13DB:5A63 B80800 MOV AX, 0008
13DB:5A66 50 PUSH AX
13DB:5A67 E845F1 CALL 4BAF
13DB:5A6A 83C404 ADD SP, +04
13DB:5A6D B80200 MOV AX, 0002
13DB:5A70 50 PUSH AX
13DB:5A71 B87454 MOV AX, 5474
13DB:5A74 50 PUSH AX
13DB:5A75 E862F1 CALL 4BDA
13DB:5A78 83C404 ADD SP, +04
13DB:5A7B B83800 MOV AX, 0038
13DB:5A7E 50 PUSH AX
13DB:5A7F B8D000 MOV AX, 00D0
13DB:5A82 50 PUSH AX
13DB:5A83 E829F1 CALL 4BAF
13DB:5A86 83C404 ADD SP, +04
13DB:5A89 2BC0 SUB AX, AX
13DB:5A8B 50 PUSH AX
13DB:5A8C B80600 MOV AX, 0006
13DB:5A8F 50 PUSH AX
13DB:5A90 E84405 CALL 5FD7
13DB:5A93 83C404 ADD SP, +04
13DB:5A96 3DFFFF CMP AX, FFFF
13DB:5A99 7508 JNZ 5AA3
13DB:5A9B 83FAFF CMP DX, -01
13DB:5A9E 7503 JNZ 5AA3
13DB:5AA0 E904FF JMP 59A7
13DB:5AA3 B85800 MOV AX, 0058
13DB:5AA6 50 PUSH AX
13DB:5AA7 50 PUSH AX
13DB:5AA8 E804F1 CALL 4BAF
13DB:5AAB 83C404 ADD SP, +04
13DB:5AAE B80200 MOV AX, 0002
13DB:5AB1 50 PUSH AX
13DB:5AB2 B88E54 MOV AX, 548E
13DB:5AB5 50 PUSH AX
13DB:5AB6 E821F1 CALL 4BDA
13DB:5AB9 83C404 ADD SP, +04
13DB:5ABC 8B5EF8 MOV BX, [BP-08]
13DB:5ABF 8A875620 MOV AL, [BX+2056]
13DB:5AC3 2AE4 SUB AH, AH
13DB:5AC5 0346FC ADD AX, [BP-04]
13DB:5AC8 2B46FA SUB AX, [BP-06]
13DB:5ACB 250F00 AND AX, 000F
13DB:5ACE 8946FE MOV [BP-02], AX
13DB:5AD1 8A876620 MOV AL, [BX+2066]
13DB:5AD5 2AE4 SUB AH, AH
13DB:5AD7 B104 MOV CL, 04
13DB:5AD9 D3E0 SHL AX, CL
13DB:5ADB 0146FE ADD [BP-02], AX
13DB:5ADE 8B5EFE MOV BX, [BP-02]
13DB:5AE1 D1E3 SHL BX, 1
13DB:5AE3 8B877620 MOV AX, [BX+2076]
13DB:5AE7 8946FE MOV [BP-02], AX
13DB:5AEA 2BC0 SUB AX, AX
13DB:5AEC 8946FC MOV [BP-04], AX
13DB:5AEF 8946FA MOV [BP-06], AX
13DB:5AF2 EB03 JMP 5AF7
13DB:5AF4 FF46FC INC WORD PTR [BP-04]
13DB:5AF7 8B5EFC MOV BX, [BP-04]
13DB:5AFA 80BF84693C CMP BYTE PTR [BX+6984], 3C
13DB:5AFF 7412 JZ 5B13
13DB:5B01 B103 MOV CL, 03
13DB:5B03 D366FA SHL WORD PTR [BP-06], CL
13DB:5B06 8A878469 MOV AL, [BX+6984]
13DB:5B0A 98 CBW
13DB:5B0B 2D3000 SUB AX, 0030
13DB:5B0E 0146FA ADD [BP-06], AX
13DB:5B11 EBE1 JMP 5AF4
13DB:5B13 B81400 MOV AX, 0014
13DB:5B16 50 PUSH AX
13DB:5B17 E816DF CALL 3A30
13DB:5B1A 83C402 ADD SP, +02
13DB:5B1D B85800 MOV AX, 0058
13DB:5B20 50 PUSH AX
13DB:5B21 50 PUSH AX
13DB:5B22 E88AF0 CALL 4BAF
13DB:5B25 83C404 ADD SP, +04
13DB:5B28 8B46FE MOV AX, [BP-02]
13DB:5B2B 3946FA CMP [BP-06], AX
13DB:5B2E 741F JZ 5B4F
13DB:5B30 B80200 MOV AX, 0002
13DB:5B33 50 PUSH AX
13DB:5B34 B8A254 MOV AX, 54A2
13DB:5B37 50 PUSH AX
13DB:5B38 E89FF0 CALL 4BDA
13DB:5B3B 83C404 ADD SP, +04
13DB:5B3E B80600 MOV AX, 0006
13DB:5B41 50 PUSH AX
13DB:5B42 9A6E33DB13 CALL 13DB:336E
13DB:5B47 83C402 ADD SP, +02
13DB:5B4A E95AFE JMP 59A7
13DB:5B4D EB1A JMP 5B69
13DB:5B4F B80200 MOV AX, 0002
13DB:5B52 50 PUSH AX
13DB:5B53 B8B654 MOV AX, 54B6
13DB:5B56 50 PUSH AX
13DB:5B57 E880F0 CALL 4BDA
13DB:5B5A 83C404 ADD SP, +04
13DB:5B5D B80B00 MOV AX, 000B
13DB:5B60 50 PUSH AX
13DB:5B61 9A6E33DB13 CALL 13DB:336E
13DB:5B66 83C402 ADD SP, +02
13DB:5B69 8BE5 MOV SP, BP
13DB:5B6B 5D POP BP
13DB:5B6C C3 RET
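As a sanity check on the listing above, the targets of the near CALL instructions can be recomputed from their raw bytes. A near call is encoded as E8 followed by a little-endian 16-bit displacement, which is added to the address of the next instruction and wraps within the 64K segment. The function name `near_call_target` is made up for this sketch.

```python
def near_call_target(ip: int, encoded: bytes) -> int:
    """Target offset of a 3-byte near CALL (E8 lo hi) located at offset ip."""
    if len(encoded) != 3 or encoded[0] != 0xE8:
        raise ValueError("expected E8 followed by a 16-bit displacement")
    rel16 = int.from_bytes(encoded[1:], "little")
    # The displacement is relative to the instruction after the CALL,
    # and the sum wraps around at 64K.
    return (ip + 3 + rel16) & 0xFFFF


# (offset, raw bytes, target shown by DEBUG), all copied from the listing.
listing = [
    (0x59A1, "E8ACCB", 0x2550),
    (0x59C4, "E813F2", 0x4BDA),
    (0x59CA, "E801A1", 0xFACE),
    (0x5A90, "E84405", 0x5FD7),
]
for ip, raw, shown in listing:
    target = near_call_target(ip, bytes.fromhex(raw))
    print(f"{ip:04X}: CALL {target:04X} (listing shows {shown:04X})")
```

The wraparound is what lets a large displacement such as CBAC reach a lower address: 59A4 + CBAC overflows 16 bits and lands on 2550.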