This is a proposal for an update to the UNIF format (currently at revision 7) that makes some clarifications and adds some functionality. I'm fully open to feedback and suggestions on this.
A summary of the changes proposed:
- Make provisions for including arbitrary data files in a ROM, such as box art, an HTML manual, or other such things.
- Create a more robust meta-data system, using a single unified 'META' chunk instead of keeping each field in its own chunk ('NAME', 'READ', etc.).
- Make some clarifications and mandates.
THE PROPOSED CHANGES
(In no particular order)
CHANGE 1: CLARIFY HOW BOOLEAN CHUNKS (such as BATR and VROR) ARE SPECIFIED
* If the chunk is present but empty, it evaluates to TRUE.
* If the chunk is present and its first byte is non-zero, it evaluates to TRUE.
* If the chunk is present and its first byte is zero, it evaluates to FALSE.
* If the chunk is not present, it evaluates to FALSE.
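For illustration, the four rules above could be sketched in Python roughly like this. The function name and the convention of passing None for a missing chunk are my own assumptions, not part of the proposal.

```python
from typing import Optional


def eval_boolean_chunk(payload: Optional[bytes]) -> bool:
    """Evaluate a boolean chunk such as BATR or VROR.

    `payload` is the chunk's contents, or None if the chunk is absent
    from the file (hypothetical calling convention).
    """
    if payload is None:
        return False           # chunk not present
    if len(payload) == 0:
        return True            # present but empty
    return payload[0] != 0     # only the first byte is inspected


if __name__ == "__main__":
    for case in (None, b"", b"\x01", b"\x00"):
        print(case, eval_boolean_chunk(case))
```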
CHANGE 2: MINOR SPECIFICATION UPDATE FOR 'MAPR'
If the chunk begins with the text 'NES-', omit that from the result. For example, if a MAPR chunk contains 'NES-NROM', it evaluates to 'NROM'.
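A minimal Python sketch of this rule follows. It assumes the board name is stored as nul-terminated text, so anything after the first 0x00 is ignored; the function name is made up for the example.

```python
def normalize_mapr(payload: bytes) -> str:
    """Return the board name from a MAPR chunk with any leading 'NES-' removed."""
    # Assumption: the name is nul-terminated; cut at the first 0x00 if there is one.
    name = payload.split(b"\x00", 1)[0].decode("utf-8")
    prefix = "NES-"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return name


if __name__ == "__main__":
    print(normalize_mapr(b"NES-NROM\x00"))
```

Only a prefix at the very start of the name is removed; other board name prefixes are left untouched.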
CHANGE 3: SPECIFY THE PCKn/CCKn ALGORITHM AS CRC32
This is just to clarify that point, as it is not mentioned in the UNIFv7 spec. Also, the 32-bit (4-byte) result of the CRC32 calculation should be stored in little-endian order, in accordance with the rest of UNIF.
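As a rough sketch, computing and storing the checksum could look like this in Python. I'm assuming "CRC32" means the common CRC-32 used by zlib, PNG and ZIP, which is what zlib.crc32 implements; the proposal itself does not name a polynomial. The function name is hypothetical.

```python
import struct
import zlib


def checksum_chunk_payload(rom_data: bytes) -> bytes:
    """Return the 4-byte PCKn/CCKn payload for the given PRG or CHR data."""
    # Assumption: the zlib/PNG/ZIP CRC-32 variant is the intended algorithm.
    crc = zlib.crc32(rom_data) & 0xFFFFFFFF
    # "<I" packs an unsigned 32-bit integer in little-endian byte order.
    return struct.pack("<I", crc)


if __name__ == "__main__":
    print(checksum_chunk_payload(b"123456789").hex())
```

For the standard test string "123456789" this CRC-32 variant gives 0xCBF43926, so the stored bytes would be 26 39 F4 CB.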
CHANGE 4: MAKE PROVISIONS FOR ARBITRARY IN-ROM DATA FILES
(Such as box art, an HTML-based manual, or whatever)
This change defines a new chunk, named 'FILE', for this purpose. Its contents are laid out as follows:
A. The beginning of the chunk's contents is a MIME type for the file, e.g. 'image/jpeg' or 'text/html'. This is nul-terminated (it ends in 0x00).
B. Then follows the filename. It is limited to 32 characters in length, of A-Z, a-z, 0-9, underscores, dashes, periods, and spaces.
C. Then follows the file contents.
More than one FILE chunk is permitted; one file per FILE chunk. This is for e.g. having an HTML manual be spread across multiple pages, or to have images that are referenced inline. To link, one would just specify the filename, e.g. <a href="page2.html">
No provision for a directory structure is made. An emulator ought to be able to extract all FILE contents to some temporary directory and invoke another process (e.g. a web browser) to view it.
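Here is one way a reader for the FILE chunk might be sketched in Python. Note one assumption that is not in the proposal: the text above does not say how the filename ends, so this sketch treats it as nul-terminated like the MIME type. All names in the code are made up for the example.

```python
import re
from typing import NamedTuple

# Allowed filename characters per item B: letters, digits, underscore,
# dash, period and space, at most 32 characters.
FILENAME_RE = re.compile(r"[A-Za-z0-9_\-. ]{1,32}")


class EmbeddedFile(NamedTuple):
    mime_type: str
    filename: str
    contents: bytes


def parse_file_chunk(payload: bytes) -> EmbeddedFile:
    """Split a FILE chunk payload into MIME type, filename and contents."""
    mime_raw, sep, rest = payload.partition(b"\x00")
    if not sep:
        raise ValueError("MIME type is not nul-terminated")
    # Assumption: the filename is also terminated by a 0x00 byte.
    name_raw, sep, contents = rest.partition(b"\x00")
    if not sep:
        raise ValueError("filename terminator not found")
    filename = name_raw.decode("ascii")
    if not FILENAME_RE.fullmatch(filename):
        raise ValueError(f"invalid filename: {filename!r}")
    return EmbeddedFile(mime_raw.decode("ascii"), filename, contents)


if __name__ == "__main__":
    print(parse_file_chunk(b"text/html\x00page2.html\x00<html></html>"))
```

Since slashes are not in the allowed character set, a filename cannot point outside the temporary directory an emulator extracts into.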
CHANGE 5: A MORE ROBUST META-DATA SYSTEM
Instead of separate NAME, READ, DINF, etc. chunks, meta-data would live in a unified chunk, named META, which can contain multiple entries. Each such entry would be like the following:
A. Name of the entry ('TITLE', etc), consisting entirely of A-Z, a-z, 0-9, and underscores; in addition, it must start with A-Z or an underscore. It is terminated by a '=' character.
B. The content of the entry, plain text, terminated by a 0x00 byte. Linebreaks and such are permitted. This is limited to 16384 bytes per entry.
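To make the entry layout concrete, here is a small Python sketch of a META reader. It takes the naming rule literally (first character A-Z or underscore) and decodes content as UTF-8; the function name and the choice to return a list of pairs (so repeated names survive) are my own assumptions.

```python
import re
from typing import List, Tuple

# Entry name: A-Z, a-z, 0-9, underscore; must start with A-Z or underscore.
NAME_RE = re.compile(r"[A-Z_][A-Za-z0-9_]*")
MAX_CONTENT_BYTES = 16384


def parse_meta(payload: bytes) -> List[Tuple[str, str]]:
    """Return the (name, content) pairs stored in a META chunk payload."""
    entries = []
    pos = 0
    while pos < len(payload):
        eq = payload.find(b"=", pos)
        if eq == -1:
            raise ValueError("entry name is not terminated by '='")
        name = payload[pos:eq].decode("ascii")
        if not NAME_RE.fullmatch(name):
            raise ValueError(f"invalid entry name: {name!r}")
        end = payload.find(b"\x00", eq + 1)
        if end == -1:
            raise ValueError("entry content is not terminated by 0x00")
        content = payload[eq + 1:end]
        if len(content) > MAX_CONTENT_BYTES:
            raise ValueError("entry content exceeds 16384 bytes")
        entries.append((name, content.decode("utf-8")))
        pos = end + 1  # the next entry starts right after the terminator
    return entries


if __name__ == "__main__":
    print(parse_meta(b"TITLE=Zelda II\x00SUBTITLE=The Adventure of Link\x00"))
```

Only the first '=' of an entry ends the name, so the content itself may contain '=' characters.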
Some proposed default/standard entry names (called "tags"):
* TITLE; title of the ROM
* SUBTITLE; for brevity of TITLE. For example: TITLE=Zelda II SUBTITLE=The Adventure of Link. Can also work to designate hacks, prototypes, or other such things.
* AUTHOR; for homebrews or ROM hacks/translations; who created/hacked it. Multiple AUTHOR tags are permitted to represent more than one author; one AUTHOR tag per author.
* DUMP_INFO; info on the dumping process, to replace UNIFv7's DINF. It has three fields, separated by a 0x0A byte (a Unix-style linebreak). The first is the name of the person who dumped the ROM; the second is the date (in a format such as: '2004/08/02 07:11:13 UTC -07:00'); the third is the name(s) of the tool(s) that performed the dump.
* README; as for UNIFv7's READ chunk.
UNIFv8 files would be free to use other tags. Such tags ought to be reasonably self-explanatory. META is set up so that if all the fields were displayed unformatted (as opposed to e.g. interpreting the DUMP_INFO tag's date field and formatting it according to local date/time conventions), it would still be readable.
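As an example of interpreting one of the standard tags, a DUMP_INFO value could be split into its three fields like this. This is only a sketch: the names are hypothetical, the date is kept as plain text because the proposal only gives an example format, and missing fields are filled with empty strings as my own choice.

```python
from typing import NamedTuple


class DumpInfo(NamedTuple):
    dumper: str
    date: str
    tools: str


def parse_dump_info(content: str) -> DumpInfo:
    """Split a DUMP_INFO entry's content on 0x0A into dumper, date and tools."""
    # At most two splits, so the tools field keeps any further linebreaks.
    fields = content.split("\n", 2)
    # Assumption: absent trailing fields are treated as empty.
    fields += [""] * (3 - len(fields))
    return DumpInfo(*fields)


if __name__ == "__main__":
    # Sample values below are invented for the demonstration.
    print(parse_dump_info("Some Dumper\n2004/08/02 07:11:13 UTC -07:00\nSome Tool"))
```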
CHANGE 6: CHARACTER ENCODING SPECIFICATION
Mandate that all text fields (e.g. META) are to use UTF-8 character encoding. This would facilitate international usage.
CHANGE 7: CLARIFY THE VERSION NUMBER IN THE UNIF HEADER
The specification for UNIFv7 states that the UNIF revision number in the header is 0x00000004 for some reason. For UNIFv8 this shall be clarified: the value is 0x00000008.
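A short Python sketch of writing and reading the revision field follows. It assumes the usual header layout as I understand it from UNIFv7 (the 4-byte tag 'UNIF', a 4-byte little-endian revision number, then 24 reserved bytes, for 32 bytes in total); the function names are made up.

```python
import struct

UNIF_MAGIC = b"UNIF"
HEADER_SIZE = 32  # assumption: 4 magic + 4 revision + 24 reserved bytes


def build_header(revision: int = 8) -> bytes:
    """Build a UNIF header with the given revision number (little-endian)."""
    return UNIF_MAGIC + struct.pack("<I", revision) + bytes(HEADER_SIZE - 8)


def read_revision(header: bytes) -> int:
    """Return the revision number stored in a UNIF header."""
    if len(header) < 8 or header[:4] != UNIF_MAGIC:
        raise ValueError("not a UNIF header")
    return struct.unpack_from("<I", header, 4)[0]


if __name__ == "__main__":
    print(build_header()[:8].hex(" "))
```

A reader written this way would see 4 in existing UNIFv7 files and 8 in files following this proposal.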
CHANGE 8: CHUNK POSITIONING
All chunks shall start on a 64-bit (8-byte) boundary relative to the start of the file. If a chunk does not *end* on such a boundary, then as many 0x00 bytes as necessary are used to fill in the space up to the next boundary, at which point the next chunk starts. If the last chunk does not end on such a boundary, no such filler will be present. (Thanks to Jamethiel for that idea)
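The padding rule could be sketched in Python as below. Assumptions on my part: each chunk is written as a 4-byte ID, a 4-byte little-endian length and then the data; the length field does not count the filler bytes; and the chunks begin right after a 32-byte header, which is already 8-byte aligned. The function name is hypothetical.

```python
import struct
from typing import Iterable, Tuple

ALIGNMENT = 8  # 64-bit boundary


def serialize_chunks(chunks: Iterable[Tuple[bytes, bytes]],
                     start_offset: int = 32) -> bytes:
    """Serialize (id, payload) pairs, padding between chunks with 0x00."""
    chunk_list = list(chunks)
    out = bytearray()
    for index, (chunk_id, payload) in enumerate(chunk_list):
        if len(chunk_id) != 4:
            raise ValueError("chunk ID must be exactly 4 bytes")
        # Assumption: the length field covers the payload only, not the filler.
        out += chunk_id + struct.pack("<I", len(payload)) + payload
        is_last = index == len(chunk_list) - 1
        if not is_last:
            # Number of filler bytes needed to reach the next 8-byte boundary.
            padding = -(start_offset + len(out)) % ALIGNMENT
            out += bytes(padding)
    return bytes(out)


if __name__ == "__main__":
    body = serialize_chunks([(b"MAPR", b"NROM\x00"), (b"BATR", b"")])
    print(len(body), body.hex(" "))
```

In the demo, the MAPR chunk occupies 13 bytes, so 3 filler bytes follow it before BATR starts; BATR is last and gets no filler.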