High Level Assembly
Operating system: Windows, Linux, FreeBSD, Mac OS X
High Level Assembly (HLA) is an assembly language developed by Randall Hyde. It allows the use of higher-level language constructs to aid both beginners and advanced assembly developers. It fully supports advanced data types and object-oriented assembly language programming. It uses a syntax loosely based on several high-level languages (HLL), such as Pascal, Ada, Modula-2, and C++, to allow creating readable assembly language programs, and to allow HLL programmers to learn HLA as fast as possible.
Origins and goals
HLA was originally conceived as a tool to teach assembly language programming at the college/university level. The goal is to leverage students' existing programming knowledge when learning assembly language to get them up to speed as fast as possible. Most students taking an assembly language programming course have already been introduced to high-level control structures such as IF, WHILE, FOR, etc. HLA allows students to immediately apply that programming knowledge to assembly language coding early in their course, allowing them to master other prerequisite subjects in assembly before learning how to code low-level forms of these control structures. "The Art of Assembly Language Programming" by Randall Hyde uses HLA for this purpose.
High vs. low-level assembler
The HLA v2.x assembler supports the same low-level machine instructions as a regular low-level assembler. The difference is that high-level assemblers (such as HLA, MASM, or TASM on the x86) also support high-level-language-like statements such as IF and WHILE, along with richer data declaration directives such as structures/records, unions, and even classes.
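As a minimal sketch of what such declarations and statements look like in practice, the following program declares a record type and uses an IF statement on one of its fields. The program name, the point type, and the variable p are hypothetical names chosen for illustration, and the example assumes the Standard Library header stdlib.hhf is on the include path.

```hla
program RecordDemo;
#include( "stdlib.hhf" )

type
    // A record (structure) with two signed 32-bit fields.
    point: record
        x: int32;
        y: int32;
    endrecord;

static
    p: point;

begin RecordDemo;

    // Ordinary machine instructions; HLA uses (source, destination) order.
    mov( 3, p.x );
    mov( -4, p.y );

    // High-level IF statement; the assembler emits the compare and branch.
    if( p.y < 0 ) then
        neg( p.y );
    endif;

    stdout.put( "x=", p.x, " y=", p.y, nl );

end RecordDemo;
```

The mov and neg lines are plain x86 instructions, so the high-level IF sits alongside machine-level code rather than replacing it.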
Unlike most other assembler tools, the HLA compiler includes a Standard Library: thousands of functions, procedures, and macros that can be used to create full applications with the ease of a high-level language. While assembly language libraries are not new, a language that includes a large standardized library makes programmers far more likely to use such library code rather than simply writing their own library functions.
HLA supports all the same low-level machine instructions as other x86 assemblers. Indeed, HLA's high-level control structures are based on the ones found in MASM and TASM, whose HLL-like features predated the arrival of HLA by several years. One can write low-level assembly code in HLA just as easily as with any other assembler by simply ignoring the HLL control constructs. In contrast to HLLs like Pascal and C/C++, HLA doesn't require inline asm statements. The HLL-like features appear in HLA as a learning aid that smooths the learning curve for beginning programmers, on the assumption that they will stop using those statements once they master the low-level instruction set. In practice, many experienced programmers continue to use HLL-like statements in HLA, MASM, and TASM long after they have mastered the low-level instruction set, usually for readability.
Of course, it is possible to write "high-level" programs using HLA, avoiding much of the tedium of low-level assembly language programming. Some assembly language programmers reject HLA out of hand because it allows programmers to do this. However, supporting both high-level and low-level programming gives any language an expanded range of applicability. Programmers who need strictly low-level code can write it, and those who want more readable code can use the higher-level statements.
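To illustrate the two styles side by side, here is a small sketch that writes the same counting loop twice: once with the WHILE statement and once with explicit compare and jump instructions. The program name, the variable count, and the labels topOfLoop and loopDone are hypothetical. The hand-written version is one reasonable lowering of the loop, not necessarily the exact code HLA generates for the WHILE statement.

```hla
program LoopDemo;
#include( "stdlib.hhf" )

static
    count: uns32;

begin LoopDemo;

    // HLL-style: print 0..4 using a WHILE statement.
    mov( 0, count );
    while( count < 5 ) do
        stdout.put( "hll: ", count, nl );
        inc( count );
    endwhile;

    // Low-level style: the same loop with explicit compare and jumps.
    mov( 0, count );
    topOfLoop:
        cmp( count, 5 );    // flags reflect count - 5
        jnb loopDone;       // unsigned "not below": leave when count >= 5
        stdout.put( "low: ", count, nl );
        inc( count );
        jmp topOfLoop;
    loopDone:

end LoopDemo;
```

Both halves can coexist in one source file, which is the sense in which no inline asm statement is needed.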
Two HLA features set it apart from other x86 assemblers: its powerful macro system (compile-time language) and the HLA Standard Library.
HLA's compile-time language allows programmers to extend the HLA language with ease, even creating their own small domain-specific language to help them solve common programming problems. The stdout.put macro from the HLA Standard Library is a good example of a sophisticated macro that can simplify programmers' lives. Consider the following invocation of the stdout.put macro:
stdout.put( "I=", i, " s=", s, " u=", u, " r=", r:10:2, nl );
The stdout.put macro processes each of the arguments to determine the argument's type and then calls an appropriate procedure in the HLA Standard library to handle the output of each of these operands.
Most assemblers provide some sort of macro capability. The advantage that HLA offers over other assemblers is that it can process macro arguments like "r:10:2" using HLA's extensive compile-time string functions, and its macro facilities can determine the types of variables and use that information to direct macro expansion.
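For context, the invocation above can be placed in a complete program roughly as follows. The variable names come from the invocation itself, but their types and initial values are assumptions made for this sketch (the article does not state them): i as a signed integer, u as an unsigned integer, r as a floating-point value, and s as a string.

```hla
program PutDemo;
#include( "stdlib.hhf" )

static
    // Types and values below are assumed for illustration only.
    i: int32  := -5;
    u: uns32  := 42;
    r: real64 := 3.14159;
    s: string := "hello";

begin PutDemo;

    // One macro call; each operand is routed to an output routine
    // that matches its declared type. "r:10:2" requests a field
    // width of 10 with 2 digits after the decimal point.
    stdout.put( "I=", i, " s=", s, " u=", u, " r=", r:10:2, nl );

end PutDemo;
```

Changing a variable's declared type changes which output routine the macro selects, without any change to the stdout.put line.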
HLA's macro language provides a special "Context-Free" macro facility. This feature allows programmers to easily write macros that span other sections of code via a "starting" and "terminating" macro pair (along with optional "intermediate" macro invocations that are only available between the start/terminate macros). For example, one can write a fully recursive/nestable SWITCH/CASE/DEFAULT/ENDSWITCH statement using this macro facility.
Because of the context-free design of the HLA macro facilities, one can nest these switch..case..default..endswitch statements, and the code emitted for the nested statements will not conflict with that of the enclosing statements.
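A nested use might look like the sketch below. It assumes that the Standard Library's hll module supplies switch/case/default/endswitch macros taking a 32-bit register operand, and that including stdlib.hhf makes them available; both points should be checked against the library version in use. The program name and the register values are arbitrary.

```hla
program SwitchDemo;
#include( "stdlib.hhf" )

begin SwitchDemo;

    mov( 2, eax );
    mov( 1, ebx );

    // Outer statement (assumed Standard Library switch macro).
    switch( eax )

        case( 1 )
            stdout.put( "outer: one", nl );

        case( 2 )
            // A second switch nested inside a case of the first.
            switch( ebx )
                case( 1 )
                    stdout.put( "outer: two, inner: one", nl );
                default
                    stdout.put( "outer: two, inner: other", nl );
            endswitch;

        default
            stdout.put( "outer: other", nl );

    endswitch;

end SwitchDemo;
```

The inner case and default clauses bind to the inner switch because they are only recognized between a matching start and terminate macro pair.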
The HLA macro system is actually a subset of a larger feature known as the HLA Compile-Time Language (CTL). The HLA CTL is an interpreted language that is available in an HLA program source file. An interpreter executes HLA CTL statements during the compilation of an HLA source file (hence the name "compile-time language").
The HLA CTL includes many control statements such as #IF, #WHILE, #FOR, and #PRINT, as well as an assignment statement (written with the ? operator). One can also create compile-time variables and constants (including structured data types such as records and unions). The HLA CTL also provides hundreds of built-in functions (including a very rich set of string and pattern-matching functions). The HLA CTL allows programmers to create CTL "programs" that scan and parse strings, allowing those programmers to create "mini-languages" or Domain Specific Embedded Languages (DSELs). The stdout.put macro appearing earlier is an example of such a DSEL. The put macro (in the stdout namespace, hence the name stdout.put) parses its macro parameter list and emits the code that will print its operands.
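As a small, hedged illustration of the CTL, the sketch below uses a compile-time variable, the ? assignment, #while, and #print to emit three stdout.put calls. The program name and the compile-time variable n are hypothetical. The loop runs inside the compiler, so the resulting program contains three straight-line calls and no run-time loop.

```hla
program CtlDemo;
#include( "stdlib.hhf" )

begin CtlDemo;

    // Message shown while compiling, not when the program runs.
    #print( "Expanding the compile-time loop" )

    // n exists only during compilation.
    ?n := 0;
    #while( n < 3 )

        // Emitted once per compile-time iteration; n is a constant
        // in each emitted copy.
        stdout.put( "n = ", n, nl );
        ?n := n + 1;

    #endwhile

end CtlDemo;
```

The same mechanism is commonly used to generate lookup tables or repeated declarations from a compact description.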
The HLA Standard Library is an extensive set of prewritten routines and macros (like the stdout.put macro described above) that make life easier for programmers, saving them from reinventing the wheel every time they write a new application. Perhaps just as important, the HLA Standard Library allows programmers to write portable applications that run under Windows or Linux with nothing more than a recompile of the source code. Like the C standard library, the HLA Standard Library allows abstracting away low-level OS calls, so the same set of OS APIs can serve for all operating systems that HLA supports. Of course, an assembly language allows making any needed OS calls, but as long as programmers use the HLA Standard Library API set, writing OS-portable programs is easy.
The HLA Standard Library provides thousands of functions, procedures, and macros. As of mid-2010 (the list changes over time), the Standard Library for HLA v2.12 includes functions in these categories:
- Command-line argument processing
- Array (dynamic) declaration and manipulation
- Bit manipulation
- Blob (binary large object) manipulation
- Character manipulation
- Character set manipulation
- Date and time functions
- Object-oriented file I/O
- Standard file I/O
- File system manipulation functions (e.g., delete, rename, and change directory)
- HLA-related declarations and functions
- The HLA Object Windows Library (Object-Oriented Framework for Win32 programming)
- Linked list manipulation
- Mathematical functions
- Memory allocation and management
- FreeBSD-specific APIs
- Linux-specific APIs
- Mac OS X-specific APIs
- Win32-specific APIs
- Text console functions
- Coroutine support
- Environment variable support
- Exception handling support
- Memory-mapped file support
- Sockets and client/server object support
- Thread and synchronization support
- Timer functions
- Pattern matching (regular expressions and context-free languages) support
- Random number generators
- Remote Procedure Call support
- Standard error output functions
- Standard output functions
- Standard input functions
- String functions
- Table (associative) support
- Zero-terminated string functions
The HLA v2.x language system is a command-line driven tool that consists of several components, including a "shell" program (e.g., hla.exe under Windows), the HLA language compiler (e.g., hlaparse.exe), a low-level translator (e.g., the HLABE, or HLA Back Engine), a linker (link.exe under Windows, ld under Linux), and other tools such as a resource compiler for Windows. Versions before 2.0 relied on an external assembler back end; versions 2.x and later of HLA use the built-in HLABE as the back-end object code formatter.
The HLA "shell" application processes command line parameters and routes appropriate files to each of the programs that make up the HLA system. It accepts as input ".hla" files (HLA source files), ".asm" files (source files for MASM, TASM, FASM, NASM, or Gas assemblers), ".obj" files for input to the linker, and ".rc" files (for use by a resource compiler).
Source Code Translation
Originally, the HLA v1.x tool compiled its source code into an intermediate source file that a "back-end" assembler such as MASM, TASM, FASM, NASM, or Gas would translate into the low-level object code file. Starting with HLA v2.0, HLA includes its own "HLA Back Engine" (HLABE) that performs the low-level object code translation. However, via various command-line parameters, HLA v2.x can still translate an HLA source file into a source file that is compatible with one of these other assemblers.
HLA Back Engine
The HLA Back Engine (HLABE) is a compiler back end that translates an internal intermediate language into low-level PE, COFF, ELF, or Mach-O object code. An HLABE "program" mostly consists of data (byte) emission statements, 32-bit relocatable address statements, x86 control-transfer instructions, and various directives. In addition to translating the byte and relocatable address statements into the low-level object code format, HLABE also handles branch-displacement optimization (picking the shortest possible form of a branch instruction).
Although the HLABE is incorporated into the HLA v2.x compiler, it is actually a separate product. It is public domain and open source (hosted on SourceForge.net).
HLA was used to write HLA Adventure, a text adventure game in the public domain. HLA has also been used to develop the real-time digital control system for TRIGA Reactors (General Atomics).
- Richard Blum, Professional Assembly Language, Wiley, 2005, ISBN 0-7645-7901-0, p. 42
- Randall Hyde, Write Great Code: Understanding the Machine, No Starch Press, 2004, ISBN 1-59327-003-8, pp. 14–15 and used throughout the book
- Randall Hyde, The Art of Assembly Language, 2nd Edition, No Starch Press, 2010, ISBN 1-59327-207-3, used throughout the book
- Downloads for Windows, Mac OS X, and Linux: http://www.plantation-productions.com/Webster/HighLevelAsm/dnld.html