Inline assembler

In computer programming, an inline assembler is a feature of some compilers that allows low-level code written in assembly language to be embedded within a program that is otherwise written in a higher-level language such as C or Ada.

Advantages

The embedding of assembly language code is usually done for one of three reasons:

Optimization: Programmers can use assembly language to implement the most performance-sensitive parts of their program's algorithms, where hand-written code can be more efficient than what the compiler would otherwise generate.

Access to processor-specific instructions: Most processors offer special instructions, such as compare-and-swap and test-and-set, which may be used to construct semaphores or other synchronization and locking primitives. Nearly every modern processor has these or similar instructions, as they are necessary to implement multitasking. Other examples of specialized instructions are found in the SPARC VIS, Intel MMX and SSE, and Motorola AltiVec instruction sets.

System calls: High-level languages rarely have a direct facility to make arbitrary system calls, so assembly code is used.
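
As an illustration of the second reason above, the following is a minimal sketch of a compare-and-swap helper built on the x86 lock cmpxchg instruction, using GCC-style extended inline assembly. The function name cas_int is hypothetical, and the fallback to a compiler builtin on non-x86 targets is an assumption made only so the sketch compiles elsewhere.

```c
#include <stdbool.h>

/* Hypothetical helper: atomically store new_val in *ptr if *ptr still
 * equals expected. Returns true when the swap took place. */
bool cas_int(volatile int *ptr, int expected, int new_val)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char swapped;
    __asm__ volatile (
        "lock cmpxchgl %3, %1\n\t" /* if (*ptr == eax) *ptr = new_val; else eax = *ptr */
        "sete %0"                  /* ZF is set only when the swap happened */
        : "=q" (swapped),          /* %0: byte-addressable register for the flag */
          "+m" (*ptr),             /* %1: memory operand, read and written */
          "+a" (expected)          /* %2: eax holds the value to compare against */
        : "r"  (new_val)           /* %3: replacement value in any general register */
        : "memory", "cc");
    return swapped;
#else
    /* Assumption: on other targets, defer to the GCC/Clang atomic builtin. */
    return __atomic_compare_exchange_n(ptr, &expected, new_val, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
#endif
}
```

A lock or semaphore could be layered on top of such a helper by retrying the call until it reports success.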

Syntax in language standards

The ISO C++ standard specifies a conditionally-supported syntax for inline assembler, and Annex J of the ISO C standard lists the asm keyword as a common extension. The C++ standard words it as follows:

    An asm declaration has the form
    asm-definition:
    asm ( string-literal ) ;
    The asm declaration is conditionally-supported; its meaning is implementation-defined.
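
A minimal sketch of the string-literal form in use is shown here. The function name answer_after_nop is hypothetical, and the sketch assumes a GCC-compatible compiler and a target whose assembler accepts the nop mnemonic. The __asm__ spelling is used because GCC disables the plain asm keyword in strict ISO C modes.

```c
/* Sketch only: "nop" is assumed to be a valid mnemonic on the target
 * (it is on x86, ARM and RISC-V assemblers, among others). */
int answer_after_nop(void)
{
    __asm__ ("nop");   /* the string is handed to the assembler as written */
    return 42;
}
```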

Example of a system call

Calling an operating system directly is generally not possible under a system using protected memory. The OS runs at a more privileged level (kernel mode) than user programs (user mode), so a software interrupt is used to make requests to the operating system. Higher-level languages rarely expose this mechanism, and so wrapper functions for system calls are written using inline assembler.

The format of basic inline assembly is straightforward:

asm (<assembly code>);

Example:

asm ("movl %ecx, %eax"); /* moves the contents of ecx to eax */

or

__asm__ ("movb %bh, (%eax)"); /* moves the byte in bh to the memory pointed to by eax */

Both asm and __asm__ are valid. __asm__ can be used if the keyword asm conflicts with something else in the program.

The following C code example shows a system call wrapper written in AT&T assembler syntax, as accepted by the GNU Assembler. Such wrappers are normally written with the aid of macros; the full code is included for clarity.

#include <errno.h>  /* declares errno; a hand-written "extern int errno" breaks where errno is a macro */

int funcname(int arg1, int *arg2, int arg3)
{
  int res;
  __asm__ volatile (
    "int $0x80"        /* make the request to the OS */
    : "=a" (res),      /* return result in eax ("a") */
      "+b" (arg1),     /* pass arg1 in ebx ("b") */
      "+c" (arg2),     /* pass arg2 in ecx ("c") */
      "+d" (arg3)      /* pass arg3 in edx ("d") */
    : "a"  (128)       /* pass system call number in eax ("a") */
    : "memory", "cc"); /* announce to the compiler that the memory and condition codes have been modified */

  /* The operating system will return a negative value on error;
   * wrappers return -1 on error and set the errno global variable */
  if (-125 <= res && res < 0) {
    errno = -res;
    res   = -1;
  }  
  return res;
}
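
The wrapper above relies on GCC's extended asm form, in which the instruction template is followed by colon-separated lists of output operands, input operands and clobbers. A smaller sketch of the same notation is given here. The function name add_ints is hypothetical, and the plain C fallback for non-x86 targets is an assumption added so the sketch builds elsewhere.

```c
/* Hypothetical helper: adds two integers with a single x86 addl. */
int add_ints(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %1, %0"  /* a += b (AT&T order: source, destination) */
             : "+r" (a)     /* %0: read-write operand in any general register */
             : "r"  (b)     /* %1: read-only input in any general register */
             : "cc");       /* addl modifies the condition codes */
    return a;
#else
    return a + b;           /* assumption: plain C on other targets */
#endif
}
```

Unlike the "a", "b", "c" and "d" constraints in the wrapper, which name specific registers, the "r" constraint leaves the choice of register to the compiler.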

Example of optimization and processor-specific instructions

This example of inline assembly from the D programming language computes the tangent of x using the x86's FPU instructions. Hand-written code of this kind can be faster than the floating-point code the compiler would otherwise generate, and it lets the programmer use the fldpi instruction, which loads the closest approximation of pi possible on the x86 architecture.

// Compute the tangent of x
real tan(real x)
{
   asm
   {
       fld     x[EBP]                  ; // load x
       fxam                            ; // test for oddball values
       fstsw   AX                      ;
       sahf                            ;
       jc      trigerr                 ; // x is NAN, infinity, or empty
                                         // 387's can handle denormals
SC18:  fptan                           ;
       fstp    ST(0)                   ; // dump X, which is always 1
       fstsw   AX                      ;
       sahf                            ;
       jnp     Lret                    ; // C2 = 0: done; C2 = 1 means x is out of range
       ;// Do argument reduction to bring x into range
       fldpi                           ;
       fxch                            ;
SC17:  fprem1                          ;
       fstsw   AX                      ;
       sahf                            ;
       jp      SC17                    ;
       fstp    ST(1)                   ; // remove pi from stack
       jmp     SC18                    ;
   }
trigerr:
   return real.nan;
Lret:
   ;
}
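
For comparison with the C examples earlier in the article, the fldpi instruction can also be reached from GCC-style inline assembly. This is a minimal sketch: the function name x87_pi is hypothetical, the "=t" constraint names the top of the x87 register stack, and the literal used on non-x86 targets is an assumption added so the sketch builds elsewhere.

```c
/* Hypothetical helper: returns pi as loaded by the x87 fldpi instruction. */
long double x87_pi(void)
{
#if defined(__i386__) || defined(__x86_64__)
    long double pi;
    __asm__ ("fldpi"        /* push pi onto the x87 register stack */
             : "=t" (pi));  /* result is read from ST(0) */
    return pi;
#else
    return 3.14159265358979323846L;  /* assumption: literal fallback */
#endif
}
```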
