Low-level programming language

From Wikipedia, the free encyclopedia

In computer science, a low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture. Generally this refers to either machine code or assembly language. The word "low" refers to the small or nonexistent amount of abstraction between the language and machine language; because of this, low-level languages are sometimes described as being "close to the hardware".

Low-level languages can be converted to machine code without using a compiler or interpreter (assembly language needs only an assembler, which translates each statement more or less directly), and the resulting code runs directly on the processor. A program written in a low-level language can be made to run very quickly and with a very small memory footprint; an equivalent program in a high-level language may carry more overhead in code size and running time. Low-level languages are simple, but are considered difficult to use because of the numerous technical details that the programmer must remember.

By comparison, a high-level programming language isolates the execution semantics of a computer architecture from the specification of the program, which simplifies development.

Low-level programming languages are sometimes divided into two categories: first generation (machine code) and second generation (assembly language).

Machine code

Machine code is the only language a microprocessor can process directly without a previous transformation. Currently, programmers almost never write programs directly in machine code, because it requires attention to numerous details which a high-level language would handle automatically, and also requires memorizing or looking up numerical codes for every instruction that is used. For this reason, second generation programming languages provide one abstraction level on top of the machine code.

Example: A function in 32-bit x86 machine code to calculate the nth Fibonacci number:

8B542408 83FA0077 06B80000 0000C383
FA027706 B8010000 00C353BB 01000000
B9010000 008D0419 83FA0376 078BD98B
C84AEBF1 5BC3
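
The numerical codes mentioned above can be illustrated with a small sketch in C. It encodes a single x86 instruction form, `mov r32, imm32`, whose opcode byte is 0xB8 plus the register number, followed by the 32-bit constant in little-endian byte order. The helper name encode_mov_imm32 and the enum reg32 are hypothetical names chosen for this illustration; the sketch covers only this one instruction form and is not a general assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Register numbers used by 32-bit x86 instruction encodings
   (only the four registers that appear in the listing above). */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3 };

/* Hypothetical helper: writes the 5-byte encoding of `mov reg, imm`
   into out and returns the number of bytes written. */
size_t encode_mov_imm32(enum reg32 reg, uint32_t imm, uint8_t out[5])
{
    out[0] = (uint8_t)(0xB8 + reg);          /* opcode: B8 + register number */
    out[1] = (uint8_t)(imm & 0xFF);          /* constant, least significant byte first */
    out[2] = (uint8_t)((imm >> 8) & 0xFF);
    out[3] = (uint8_t)((imm >> 16) & 0xFF);
    out[4] = (uint8_t)((imm >> 24) & 0xFF);
    return 5;
}
```

Calling encode_mov_imm32(EBX, 1, buf) fills buf with BB 01 00 00 00, and encode_mov_imm32(EAX, 0, buf) yields B8 00 00 00 00; both byte sequences can be found in the listing above.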

Assembly

Assembly language adds almost no semantics of its own: it is essentially a mapping of human-readable symbols, including symbolic addresses, to opcodes, addresses, numeric constants, strings and so on. Typically, one machine instruction is represented as one line of assembly code. Assemblers produce object files which may be linked with other object files or loaded on their own.

Most assemblers provide macros.

Example: The same Fibonacci number calculator as above, but in x86 assembly language using MASM syntax:

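fib:
    mov edx, [esp+8]        ; edx = n, read from the stack
    cmp edx, 0
    ja @f                   ; n > 0: continue at the next @@ label
    mov eax, 0              ; fib(0) = 0
    ret

    @@:
    cmp edx, 2
    ja @f                   ; n > 2: go on to the loop setup
    mov eax, 1              ; fib(1) = fib(2) = 1
    ret

    @@:
    push ebx                ; preserve ebx for the caller
    mov ebx, 1              ; ebx = earlier of the two previous terms
    mov ecx, 1              ; ecx = later of the two previous terms

    @@:
        lea eax, [ebx+ecx]  ; eax = sum of the two previous terms
        cmp edx, 3
        jbe @f              ; n <= 3: eax holds the result
        mov ebx, ecx
        mov ecx, eax
        dec edx
    jmp @b                  ; back to the start of the loop

    @@:
    pop ebx                 ; restore ebx
    ret                     ; result is returned in eax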
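
One task an assembler takes over is turning symbolic addresses such as `@f` and `@b` into numbers. For the short jumps used above, the encoded operand is an 8-bit displacement counted from the end of the 2-byte jump instruction. The following C sketch shows that calculation; the helper name short_jump_disp is hypothetical, and the byte offsets in the usage note were worked out by hand from the machine code listing, so they should be read as an illustration rather than as assembler output.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: computes the displacement byte of a 2-byte short
   jump located at jump_offset whose label lies at target_offset (both
   measured in bytes from the start of the function). Returns false if
   the target is out of range for an 8-bit signed displacement. */
bool short_jump_disp(long jump_offset, long target_offset, uint8_t *disp_byte)
{
    /* The displacement is relative to the instruction after the jump. */
    long disp = target_offset - (jump_offset + 2);

    if (disp < -128 || disp > 127)
        return false;

    *disp_byte = (uint8_t)disp;   /* two's complement for backward jumps */
    return true;
}
```

Under these assumptions, the first `ja @f` sits at offset 7 and its label at offset 15, giving the displacement 06 (encoded as 77 06); the final `jmp @b` sits at offset 50 and jumps back to offset 37, giving F1 (encoded as EB F1).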

Low-level programming in high-level languages

Experiments with hardware support in high-level languages in the late 1960s led to such languages as PL/S, BLISS, BCPL, and extended ALGOL for Burroughs large systems being used for low-level programming. Forth also has applications as a systems language. However, the language that became pre-eminent in systems programming was C.

C is considered a third generation programming language, since it is structured and abstracts from machine code (historically, no second generation programming language emerged that was particularly suitable for low-level programming). However, many programmers today might refer to C as low-level, as it lacks a large runtime system (no garbage collection, etc.), basically supports only scalar operations, and provides direct memory addressing. It therefore readily blends with assembly language and the machine level of CPUs and microcontrollers. C's ability to abstract from the machine level means that the same code can be compiled for different hardware platforms; however, fine-grained control at the systems level is still possible, provided that the target platform has certain broadly defined features in place, such as a flat memory model and memory that is divided into bytes. C programs may require a certain amount of "tweaking", often implemented by conditional compilation, for different target platforms. The process of adapting a systems program for a different platform is known as porting.
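
Two of the features named above, byte-level memory access and conditional compilation, can be sketched briefly in C. All function names here (platform_name, is_little_endian, store_le32, load_le32) are hypothetical and chosen for illustration; the macros `_WIN32`, `__linux__` and `__APPLE__` are ones that compilers for those platforms commonly predefine, and the fallback branch covers everything else.

```c
#include <stdint.h>

/* Conditional compilation: the preprocessor keeps exactly one of these
   definitions, depending on the platform the code is compiled for. */
#if defined(_WIN32)
#define PLATFORM_NAME "windows"
#elif defined(__linux__)
#define PLATFORM_NAME "linux"
#elif defined(__APPLE__)
#define PLATFORM_NAME "apple"
#else
#define PLATFORM_NAME "unknown"
#endif

const char *platform_name(void)
{
    return PLATFORM_NAME;
}

/* Direct memory addressing: view an integer as the bytes it occupies. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    const unsigned char *bytes = (const unsigned char *)&probe;
    return bytes[0] == 1;   /* low-order byte stored first? */
}

/* Portable alternative: fix the byte order explicitly, so the same
   code behaves identically whatever the host byte order is. */
void store_le32(unsigned char *dst, uint32_t value)
{
    dst[0] = (unsigned char)(value & 0xFF);
    dst[1] = (unsigned char)((value >> 8) & 0xFF);
    dst[2] = (unsigned char)((value >> 16) & 0xFF);
    dst[3] = (unsigned char)((value >> 24) & 0xFF);
}

uint32_t load_le32(const unsigned char *src)
{
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}
```

The result of is_little_endian depends on the machine, which is the kind of difference that porting has to account for; store_le32 and load_le32 avoid the dependency by never relying on how the host lays out an integer in memory.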

Example: A function that calculates the nth Fibonacci number in C:

unsigned int fib(unsigned int n)
{
    if (n <= 0)
        return 0;
    else if (n <= 2)
        return 1;
    else {
        unsigned int a, b, c;   /* unsigned, to match the return type */
        a = 1;
        b = 1;
        while (1) {             /* left via the return below */
            c = a + b;
            if (n <= 3) return c;
            a = b;
            b = c;
            n--;
        }
    }
}

Relative meaning

The terms high-level and low-level are inherently relative. Some decades ago, C and similar languages were most often considered "high-level"; many programmers today might refer to C as low-level.

Assembly language may itself be regarded as a higher-level representation of machine code (though often still a one-to-one representation if used without macros), as it supports concepts such as constants and (limited) expressions, sometimes even variables, procedures, and data structures. Machine code, in turn, is inherently at a slightly higher level than the microcode or micro-operations used internally in many processors.

See also