Understanding the iNES Header
Headers are small chunks of bytes that come at the beginning of files and tell programs how to read and interpret those files. Pretty much any file you might open, from an MP3 to a PDF, begins with a header.
When NES games were first developed, they didn't need headers. They were burned into integrated circuits on game cartridges that interfaced directly with the NES hardware—in other words, they weren't computer files, and they weren't being interpreted by software programs. With the rise of software emulators in the nineties, however, it became necessary to define a header format for the files that those emulators could open and read.
The header format that caught on and became the standard was the one used by iNES, developed primarily by Marat Fayzullin. This is what makes the .nes file format what it is. Let's take a look at an example of the header in a NES ROM file.
.segment "HEADER"
.byte $4e, $45, $53, $1a
.byte $02
.byte $01
.byte %00000010
.byte %00000000
.byte %00000000
There's a lot packed into these few bytes, so let's take them line by line.
.segment "HEADER"
This is a directive to the assembler. It tells the assembler to put the bytes that follow into the segment called HEADER, which should be defined (in the linker's config file) as the very first region of the output file. The header needs to come first, because it's what tells any program that opens this file, like an emulator, what kind of file it is and how to handle it.
Bytes 0-3
.byte $4e, $45, $53, $1a
These four bytes identify the file format, in this case an iNES ROM, aka a .nes file. Why these particular bytes? They're the ASCII codes for the letters N, E, S, followed by an ASCII code called SUB. SUB is short for "substitute character," and it has a number of different uses, one of which is traditionally to mark the end of a stream of characters. In other words, these bytes are saying: N-E-S-that's all.
Note that most programs won't know how to interpret this header. If you opened a .nes file in Photoshop, for instance, it probably wouldn't know what to do with it. But any emulator worth its salt will see this string of four bytes and say, "Aha! A NES game!"
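Here's a minimal sketch, in Python, of the check an emulator might run on those first four bytes. The function name is_ines and the variable example_header are made up for illustration; they don't come from any particular emulator.

```python
# The four identification bytes: "N", "E", "S", then SUB ($1a).
INES_MAGIC = b"NES\x1a"


def is_ines(data: bytes) -> bool:
    """Return True if the data starts with the iNES identification bytes."""
    return data[:4] == INES_MAGIC


# The nine bytes from the example header above.
example_header = bytes([0x4E, 0x45, 0x53, 0x1A, 0x02, 0x01, 0b00000010, 0x00, 0x00])

print(is_ines(example_header))  # True
print(is_ines(b"%PDF-1.7"))     # False
```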
Byte 4
.byte $02
Now we're getting to the good stuff. This byte tells the emulator the size of PRG ROM in the cart, as measured in 16-kilobyte blocks. Our example header, therefore, defines a cartridge with 2 × 16k = 32k of PRG ROM, which is what a standard NES cart has.
Byte 5
.byte $01
Similar to the previous byte, Byte 5 defines the size of CHR ROM in the cartridge. There are two key differences. First and most importantly, CHR ROM is measured here in 8-kilobyte chunks. So in our example, a value of $01 means 8k of CHR ROM, which is again what a standard NES cart has. Second, for carts that use CHR RAM instead of ROM, you'll see a value of $00 here.
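The arithmetic for Bytes 4 and 5 can be sketched in Python like this. The function name rom_sizes and the constant names are hypothetical, chosen just for this illustration.

```python
PRG_BANK_SIZE = 16 * 1024  # Byte 4 counts PRG ROM in 16-kilobyte blocks
CHR_BANK_SIZE = 8 * 1024   # Byte 5 counts CHR ROM in 8-kilobyte blocks


def rom_sizes(header: bytes) -> tuple:
    """Return (PRG ROM bytes, CHR ROM bytes, uses CHR RAM?) for a header."""
    prg_bytes = header[4] * PRG_BANK_SIZE
    chr_bytes = header[5] * CHR_BANK_SIZE
    uses_chr_ram = header[5] == 0  # $00 means the cart has CHR RAM, not ROM
    return prg_bytes, chr_bytes, uses_chr_ram


# The example header: Byte 4 is $02, Byte 5 is $01.
example_header = bytes([0x4E, 0x45, 0x53, 0x1A, 0x02, 0x01, 0b00000010, 0x00, 0x00])
print(rom_sizes(example_header))  # (32768, 8192, False)
```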
Byte 6
.byte %00000010
Byte 6 does double duty. The four low bits (i.e. the four bits on the right) are a series of individual flags that tell us different things about the game cart. The four high bits (the four bits on the left) are actually the low bits of the mapper number. I know, it's confusing.
Let's start with the flags. I'll write them out with the example values first, the bit numbers below those, and short explanations aligned to each bit. This diagram is adapted from one at the NESdev wiki.
0010
3210
||||
|||+- Mirroring: 0: horizontal (aka vertical scrolling)
|||              1: vertical (aka horizontal scrolling)
||+-- 1: Cartridge contains battery-backed PRG RAM or other persistent memory
|+--- 1: 512-byte trainer at $7000-$71FF
+---- 1: Ignore mirroring control or above mirroring bit; instead provide four-screen VRAM
Got all that? Okay, taking them from right to left...
Bit 0 is important. It tells the emulator whether the cartridge's nametables use horizontal or vertical mirroring. Before the creation of mappers that allowed for software switching, the mirroring setting was actually hardwired into each cart's circuitry and couldn't be changed. In our example, 0 tells the emulator to use horizontal mirroring. (Note that for mappers that don't use hardwired mirroring, this bit is ignored.)
Bit 1 tells the emulator if the cartridge has some kind of persistent memory scheme, usually battery-backed PRG RAM. This is how carts like The Legend of Zelda let you save your game without a password system. In our example this bit is 1, so the emulator knows this cart has save memory.
Bit 2 alerts the emulator to the presence of a trainer, a 512-byte block of extra code that sits between the header and the PRG ROM data and gets loaded at $7000. I honestly don't know much about trainers, but they're not used often.
A few games, such as Gauntlet, found ways to store four separate screens in the four nametables. In those cases, Bit 3 is turned on.
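The four flags above can be pulled out of Byte 6 with bit masks. This is just a sketch in Python; decode_flags6 and the dictionary keys are names invented for this example.

```python
def decode_flags6(flags6: int) -> dict:
    """Decode the four low bits of Byte 6 into named flags."""
    return {
        "mirroring": "vertical" if flags6 & 0b0001 else "horizontal",  # bit 0
        "battery": bool(flags6 & 0b0010),      # bit 1: persistent memory
        "trainer": bool(flags6 & 0b0100),      # bit 2: 512-byte trainer
        "four_screen": bool(flags6 & 0b1000),  # bit 3: four-screen VRAM
    }


# Byte 6 from the example header.
print(decode_flags6(0b00000010))
# {'mirroring': 'horizontal', 'battery': True, 'trainer': False, 'four_screen': False}
```

An emulator following the diagram would check four_screen first, since that bit overrides the mirroring bit when it's set.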
Okay, now let's look at the four high bits.
0000
Thrilling! This tells us that the low half of the mapper number is 0. We'll attach this to the high half, which we'll find in a moment in Byte 7, to determine what mapper this cart uses.
Byte 7
.byte %00000000
Just like Byte 6, Byte 7 is split in halves. The high four bits of this byte are the high bits of the mapper number, and the low bits are four more flags (though they're more obscure than the flags in Byte 6).
Let's start by figuring out the mapper number. We'll take the high bits from Byte 7...
0000
Oh, okay. These are the high bits of the mapper number, so we put them in front of the four bits we got from Byte 6. That gives us a mapper number of 0000 0000, or 0. This game uses the NROM mapper, so it's probably an original Nintendo release or something set up like it.
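Gluing the two halves together is a shift and a mask. Here's one way to sketch it in Python; mapper_number is a hypothetical helper name, and the second pair of bytes below is made up purely to show a case where both halves are nonzero.

```python
def mapper_number(flags6: int, flags7: int) -> int:
    """Combine the high bits of Byte 6 and Byte 7 into the mapper number."""
    low_half = (flags6 >> 4) & 0x0F  # high four bits of Byte 6, moved down
    high_half = flags7 & 0xF0        # high four bits of Byte 7, already in place
    return high_half | low_half


# Bytes 6 and 7 from the example header.
print(mapper_number(0b00000010, 0b00000000))  # 0

# A made-up pair: 0100 from Byte 7 in front of 0001 from Byte 6 is 0100 0001.
print(mapper_number(0b00010000, 0b01000000))  # 65
```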
Whew! Almost done. Let's quickly cover the other half of this byte, which is easy, because only one of its flags really matters here. Bit 0 indicates a game designed for the VS Unisystem, Nintendo's coin-operated arcade version of the NES hardware (which this one isn't, so the flag is a 0).
Byte 8
.byte %00000000
This byte defines the size of a cart's PRG RAM, if it exists, measured in 8-kilobyte units. However, it was a late addition to the existing iNES standard and wasn't widely adopted. I don't have any experience with this setting.
One More Example
Let's take a quick spin through one more iNES header.
.segment "HEADER"
.byte $4e, $45, $53, $1a
.byte $02
.byte $04
.byte %00110001
.byte %00000000
.byte %00000000
There's no need to step through every line; let's just look at the differences (though it's important to remember that the first four bytes have to be the same). Byte 5 is $04, meaning this cart has 4 × 8k = 32k of CHR ROM. That suggests right away that a mapper other than good old NROM is in play. And indeed, when we look at Byte 6, we see that the high bits are 0011, or 3. The high bits of Byte 7 are 0000, so our mapper number is 0000 0011, or 3. That happens to be the CNROM mapper, which can swap CHR ROM via bank switching. Nice!
Looking back at Byte 6, the low bits are also different from our first example. Bit 0 is a 1, telling us this cart is hardwired for vertical mirroring. And bit 1 is a 0, meaning the game doesn't have battery-backed persistence.
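To tie the pieces together, here's a rough Python sketch that reads this second header in one go. The names INesHeader and parse_ines_header are invented for this example. It also assumes the header is padded with zeros out to the full 16 bytes of an iNES header, which the listing above doesn't show. Bit 3 of Byte 6 (four-screen) and the Byte 7 flags are left out to keep it short.

```python
from dataclasses import dataclass

HEADER_SIZE = 16    # a full iNES header is 16 bytes long
TRAINER_SIZE = 512  # optional trainer between the header and PRG ROM


@dataclass
class INesHeader:
    prg_rom_bytes: int
    chr_rom_bytes: int
    mapper: int
    mirroring: str
    battery: bool
    trainer: bool
    prg_rom_offset: int  # where PRG ROM data starts in the file


def parse_ines_header(data: bytes) -> INesHeader:
    """Read the fields covered in this article from an iNES header."""
    if len(data) < HEADER_SIZE or data[:4] != b"NES\x1a":
        raise ValueError("not an iNES file")
    flags6, flags7 = data[6], data[7]
    trainer = bool(flags6 & 0b0100)
    return INesHeader(
        prg_rom_bytes=data[4] * 16 * 1024,
        chr_rom_bytes=data[5] * 8 * 1024,
        mapper=(flags7 & 0xF0) | (flags6 >> 4),
        mirroring="vertical" if flags6 & 0b0001 else "horizontal",
        battery=bool(flags6 & 0b0010),
        trainer=trainer,
        prg_rom_offset=HEADER_SIZE + (TRAINER_SIZE if trainer else 0),
    )


# The second example header, zero-padded to 16 bytes (an assumption).
second_example = bytes(
    [0x4E, 0x45, 0x53, 0x1A, 0x02, 0x04, 0b00110001, 0x00, 0x00]
).ljust(HEADER_SIZE, b"\x00")

header = parse_ines_header(second_example)
print(header.mapper, header.chr_rom_bytes, header.mirroring, header.battery)
# 3 32768 vertical False
```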
And that's how you read an iNES header!