Intel HEX is a file format for conveying binary information for applications like programming microcontrollers, EPROMs, and other kinds of chips. In a typical application, a compiler or assembler converts a program's source code (such as in C or assembly language) to machine code and outputs it into a HEX file. That file is then imported by a programmer to "burn" the machine code into a ROM, or is transferred to the target system for loading and execution.
Each line of an Intel HEX file consists of six parts:
- Start code, one character, an ASCII colon ':'.
- Byte count, two hex digits, the number of bytes (hex digit pairs) in the data field. 16 (0x10) or 32 (0x20) bytes of data are the usual compromise values between line length and address overhead.
- Address, four hex digits, the 16-bit address at which the data begins in memory. Sixteen bits limit the addressable range to 64 kilobytes; the limit is worked around by specifying the higher address bits via additional record types.
- Record type, two hex digits, 00 to 05, defining the type of the data field.
- Data, a sequence of n bytes of the data themselves, represented by 2n hex digits.
- Checksum, two hex digits: the least significant byte of the two's complement of the sum of the values of all fields except fields 1 and 6 (the start code ':' and the two hex digits of the checksum itself). It is calculated by adding together the hex-encoded bytes (hex digit pairs), keeping only the least significant byte of the result, and taking its two's complement (either by subtracting the byte from 0x100, or by inverting it with an XOR with 0xFF and adding 0x01). If you are not working with 8-bit variables, you must suppress the overflow by AND-ing the result with 0xFF; the overflow can occur because both 0x100 - 0x00 and (0x00 XOR 0xFF) + 1 equal 0x100. If the checksum is correct, adding all the bytes of the record (the byte count, both address bytes, the record type, each data byte and the checksum) always gives a value whose least significant byte is zero (0x00).
For example, for the record :0300300002337A1E
03 + 00 + 30 + 00 + 02 + 33 + 7A = E2 (all values in hex), and the two's complement of E2 is 1E, which is the checksum.
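The checksum rule can be sketched in a few lines of Python. This is only an illustration: the function names `intel_hex_checksum` and `checksum_is_valid` are made up for this example and do not come from any particular tool.

```python
def intel_hex_checksum(fields: bytes) -> int:
    """Checksum for the bytes between the ':' and the checksum itself.

    Negating the sum and masking with 0xFF gives the two's complement
    of the least significant byte, with the overflow suppressed.
    """
    return (-sum(fields)) & 0xFF


def checksum_is_valid(line: str) -> bool:
    """Check a complete record line, including its trailing checksum."""
    raw = bytes.fromhex(line.strip()[1:])  # drop the ':' start code
    return (sum(raw) & 0xFF) == 0


print(f"{intel_hex_checksum(bytes.fromhex('0300300002337A')):02X}")  # 1E
print(checksum_is_valid(":0300300002337A1E"))                        # True
```

The validity check uses the property described above: when the checksum is included, the least significant byte of the sum is zero.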
There are six record types:
- 00, data record, contains data and a 16-bit address. This is the format described above.
- 01, End Of File record. Must occur exactly once per file in the last line of the file. The byte count is 00 and the data field is empty. Usually the address field is also 0000, in which case the complete line is ':00000001FF'. Originally the End Of File record could contain a start address for the program being loaded, e.g. :00AB2F0125 would cause a jump to address AB2F. This was convenient when programs were loaded from punched paper tape.
- 02, Extended Segment Address Record, segment-base address (two hex digit pairs in big endian order). Used when 16 bits are not enough; the scheme is identical to 80x86 real mode addressing. The value in the data field of the most recent 02 record is multiplied by 16 (shifted 4 bits left) and added to the addresses of subsequent data records. This allows addressing of up to a megabyte of address space. The address field of this record has to be 0000, and the byte count is 02 (the segment is 16-bit). Because of the shift, the least significant hex digit of the resulting segment base address is always 0.
- 03, Start Segment Address Record. For 80x86 processors, it specifies the initial content of the CS:IP registers. The address field is 0000, the byte count is 04, the first two bytes are the CS value, the latter two are the IP value.
- 04, Extended Linear Address Record, allowing for full 32-bit addressing (up to 4 GiB). The address field is 0000, the byte count is 02. The two data bytes (two hex digit pairs in big endian order) represent the upper 16 bits of the 32-bit address for all subsequent 00 type records until the next 04 type record. If there is no 04 type record, the upper 16 bits default to 0000. To get the absolute address for subsequent 00 type records, the value in the data field of the most recent 04 record is shifted left by 16 bits and combined with the 16-bit address of the 00 record.
- 05, Start Linear Address Record. The address field is 0000, the byte count is 04. The 4 data bytes represent the 32-bit value loaded into the EIP register of 80386 and later CPUs.
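How the 02 and 04 records extend the 16-bit address of a data record can be illustrated as follows. This is a minimal sketch: the function `absolute_address` and its parameters are hypothetical names chosen for this example, and it ignores the wrap-around behaviour at the end of a segment.

```python
def absolute_address(offset, base_type=None, base_value=0):
    """Resolve the 16-bit address of a 00 record to an absolute address.

    offset     -- address field of the data record (0..0xFFFF)
    base_type  -- 0x02 or 0x04 for the most recent extended address
                  record, or None if no such record has been seen
    base_value -- 16-bit value from that record's data field
    """
    if base_type == 0x02:
        # Segment addressing: segment * 16 + offset
        return (base_value << 4) + offset
    if base_type == 0x04:
        # Linear addressing: the data field supplies the upper 16 bits
        return (base_value << 16) | offset
    return offset


print(hex(absolute_address(0x0010, 0x02, 0x1200)))  # 0x12010
print(hex(absolute_address(0x1234, 0x04, 0x0800)))  # 0x8001234
print(hex(absolute_address(0x1234)))                # 0x1234
```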
Some applications use custom record types:
- 20, ROM code, used by Samsung SAMA assembler
- 22, Extension code, used by Samsung Smart Studio microcontroller development system
Beware! While addresses are always given as big-endian byte addresses, the format does not specify how to interpret the data bytes. Whether they are taken as single bytes or as 16- or 32-bit little- or big-endian words is application-specific.
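To make the ambiguity concrete, the following sketch decodes the same four data bytes in several ways using Python's standard `struct` module. The byte values are arbitrary sample data; which interpretation is right depends entirely on the target application.

```python
import struct

data = bytes.fromhex("21460136")  # four bytes from a data field

as_bytes = list(data)
le16 = struct.unpack("<2H", data)  # two 16-bit little-endian words
be16 = struct.unpack(">2H", data)  # two 16-bit big-endian words
le32 = struct.unpack("<I", data)   # one 32-bit little-endian word
be32 = struct.unpack(">I", data)   # one 32-bit big-endian word

print([f"{b:02X}" for b in as_bytes])  # ['21', '46', '01', '36']
print([f"{w:04X}" for w in le16])      # ['4621', '3601']
print([f"{w:04X}" for w in be16])      # ['2146', '0136']
print(f"{le32[0]:08X}")                # 36014621
print(f"{be32[0]:08X}")                # 21460136
```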
Sometimes the terms I8HEX, I16HEX and I32HEX (or INTEL 8/16/32, respectively) are used, usually in the context of x86 CPUs. The file format is the same in all three cases, but the terms imply the use of a particular subset of the possible record types: I8HEX uses only types 00/01 (16-bit addresses), I16HEX adds types 02/03 (20-bit addresses), and I32HEX adds 04/05 (32-bit addresses).
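A rough way to tell which of these subsets a file uses is to look at the record types it contains. The helpers below are a hypothetical sketch (the names `record_types` and `hex_variant` are invented here); they do not validate the records and simply report the smallest subset that covers the types found.

```python
def record_types(lines):
    """Collect the record type of every non-empty line.

    The type is the two hex digits at characters 8 and 9 of the line,
    counting the ':' start code as character 1.
    """
    return {int(line.strip()[7:9], 16) for line in lines if line.strip()}


def hex_variant(types):
    """Name the subset implied by a collection of record types."""
    types = set(types)
    if types & {0x04, 0x05}:
        return "I32HEX"
    if types & {0x02, 0x03}:
        return "I16HEX"
    return "I8HEX"


print(hex_variant(record_types([":0300300002337A1E", ":00000001FF"])))
# I8HEX
print(hex_variant(record_types([":020000040800F2", ":00000001FF"])))
# I32HEX
```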
A slightly different ASCII formatting termed SREC is used with Motorola processors.
:10010000214601360121470136007EFE09D2190140
:100110002146017EB7C20001FF5F16002148011988
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
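As an illustration, the example file can be decoded into a memory image with a short Python routine. This is a sketch under simplifying assumptions, not a reference loader: `load_intel_hex` is a name invented here, start address records (03/05) and custom record types are skipped, and address wrap-around is not modelled.

```python
EXAMPLE = """\
:10010000214601360121470136007EFE09D2190140
:100110002146017EB7C20001FF5F16002148011988
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
"""


def load_intel_hex(text):
    """Return a dict mapping absolute addresses to byte values."""
    memory = {}
    base = 0  # set by the most recent 02 or 04 record
    for number, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {number}: missing ':' start code")
        raw = bytes.fromhex(line[1:])
        if len(raw) < 5:
            raise ValueError(f"line {number}: record too short")
        if sum(raw) & 0xFF:
            raise ValueError(f"line {number}: bad checksum")
        count = raw[0]
        address = int.from_bytes(raw[1:3], "big")
        rtype = raw[3]
        data = raw[4:-1]
        if len(data) != count:
            raise ValueError(f"line {number}: byte count mismatch")
        if rtype == 0x00:    # data record
            for i, value in enumerate(data):
                memory[base + address + i] = value
        elif rtype == 0x01:  # End Of File record
            break
        elif rtype == 0x02:  # Extended Segment Address
            base = int.from_bytes(data, "big") << 4
        elif rtype == 0x04:  # Extended Linear Address
            base = int.from_bytes(data, "big") << 16
        # Other record types are ignored in this sketch.
    return memory


image = load_intel_hex(EXAMPLE)
print(len(image), hex(min(image)), hex(max(image)))  # 64 0x100 0x13f
```

The four data records each carry 16 bytes, so the image covers the 64 addresses from 0x0100 to 0x013F.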
- Intel Hexadecimal Object File Format Specification 1988 (PDF), Revision A, January 6, 1988
- Format description at PIC List
- Format description
- SRecord, multi-platform GPL'ed tool for manipulating EPROM load files.
- Binex, a converter between Intel HEX and binary.
- libgis, open source library to handle Intel HEX (and more).
- SB-Projects: fileformats: intel hex, clear and well structured reference with various examples