Hygienic macro

From Wikipedia, the free encyclopedia

Hygienic macros are macros whose expansion is guaranteed not to cause collisions with existing symbol definitions. They are a feature of programming languages such as Scheme and Dylan.

The hygiene problem

In a programming language that has unhygienic macros, it is possible for existing variable bindings to be hidden from a macro by variable bindings that are created during its expansion. In C, this problem can be illustrated by the following fragment:

#include <stdio.h>

#define INCI(i) {int a=0; ++i;}
int main(void)
{
    int a = 0, b = 0;
    INCI(a);
    INCI(b);
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}

Running the above through the C preprocessor produces the following (the expanded contents of <stdio.h> are omitted):

int main(void)
{
    int a = 0, b = 0;
    {int a=0; ++a;};
    {int a=0; ++b;};
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}

The variable a declared in main is therefore never altered: the increment applies to the inner a introduced by the macro, as the output of the compiled program shows:

a is now 0, b is now 1

Note that some C compilers, such as gcc, have an option like -Wshadow that warns when a local variable shadows another variable, which would have caught the above problem.

The simplest and least robust solution is to give the macro's variables unique names:

#include <stdio.h>

#define INCI(i) {int INCIa=0; ++i;}
int main(void)
{
    int a = 0, b = 0;
    INCI(a);
    INCI(b);
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}

As long as the calling code never uses a variable named INCIa, this solution produces the correct output:

a is now 1, b is now 1
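
To see why this solution is the least robust, consider a caller whose own variable happens to be called INCIa. The following is a minimal sketch that reuses the macro above; the two function names are hypothetical and exist only for illustration.

```c
#define INCI(i) {int INCIa=0; ++i;}

/* The caller's variable has the same name as the macro's temporary.
   INCI(INCIa) expands to {int INCIa=0; ++INCIa;}, so the increment
   hits the inner variable and the caller's variable is left alone. */
int inci_on_colliding_name(void)
{
    int INCIa = 0;
    INCI(INCIa);
    return INCIa;
}

/* Any other name works as intended: the caller's variable is incremented. */
int inci_on_other_name(void)
{
    int a = 0;
    INCI(a);
    return a;
}
```

Called from main, inci_on_colliding_name() returns 0 rather than the intended 1, while inci_on_other_name() returns 1. Renaming the temporary only moves the collision to a less likely name.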

The "hygiene problem" can extend beyond variable bindings. Consider this Common Lisp macro:

(defmacro my-unless (condition &body body)
 `(if (not ,condition)
    (progn
      ,@body)))

Although this macro references no variables, it assumes that the symbols "if", "not", and "progn" all have their standard meanings ("if" and "progn" are special operators, "not" is a function). Suppose, however, that the macro is used in the following code:

(flet ((not (x) x))
  (my-unless t
    (format t "This should not be printed!")))

Because the definition of "not" has been locally altered, the behavior is undefined. Redefining standard functions and operators, globally or locally, invokes undefined behavior according to ANSI Common Lisp. Such usage can be diagnosed by the implementation as erroneous.

Of course, the problem can occur for program-defined functions which are not protected in the same way:

(defmacro my-unless (condition &body body)
 `(if (user-defined-operator ,condition)
    (progn
      ,@body)))
(flet ((user-defined-operator (x) x))
  (my-unless t
    (format t "This should not be printed!")))

The proper Common Lisp solution to this problem is to use packages. The my-unless macro can reside in its own package, where user-defined-operator is a private symbol in that package.

The symbol user-defined-operator occurring in the user code will then be a different symbol, unrelated to the one used in the macro.
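
Capture of a free identifier in a macro expansion is not specific to Lisp. As a rough C analogue of the user-defined-operator example (a sketch only; the names twice, negate, DOUBLE and the two test functions are hypothetical), a local declaration at the point of use can silently change what the expansion refers to:

```c
static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* The expansion refers to the free identifier `twice`, which is
   resolved where the macro is used, not where it is defined. */
#define DOUBLE(x) twice(x)

/* No local binding interferes: DOUBLE(3) calls the file-scope twice. */
int double_normally(void)
{
    return DOUBLE(3);
}

/* A local function pointer named `twice` shadows the file-scope function,
   so the same expansion now calls negate instead. */
int double_with_shadowing(void)
{
    int (*twice)(int) = negate;
    return DOUBLE(3);
}
```

Here double_normally() returns 6 and double_with_shadowing() returns -3. C has no package system to keep the two names apart, so the usual mitigation is a naming convention for identifiers that macros depend on.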

Strategies

In some languages such as Common Lisp, Scheme and others of the Lisp language family, macros provide a powerful means of extending the language. Here the lack of hygiene in conventional macros is resolved by several strategies.

  • Obfuscation. If the programmer needs temporary storage during the expansion of a macro, they can give it an unusual name and hope that the same name will never be used in a program that uses the macro. This offers no guarantee, and a programmer who knows about gensym (see the next point) has no reason to rely on it.
  • Temporary symbol creation. In some programming languages it is possible for a new variable name, or symbol, to be generated and bound to a temporary location. The language processing system ensures that this never clashes with another name or location in the execution environment. The responsibility for choosing to use this feature within the body of a macro definition is left to the programmer. This method was used in MacLisp, where a function named "gensym" could be used to generate a new symbol name. Similar functions (usually named gensym as well) exist in many Lisp-like languages, including the widely implemented Common Lisp[1] standard.
  • Read-time uninterned symbol. This is similar to the first solution in that a single name is shared by multiple expansions of the same macro. Instead of an unusual name, however, the macro uses a read-time uninterned symbol (written with the #: notation), which cannot occur outside of the macro.
  • Packages. Instead of an unusual name or an uninterned symbol, the macro simply uses a private symbol from the package in which the macro is defined. The symbol will not accidentally occur in user code. User code would have to reach inside the package using the double colon :: notation to give itself permission to use the private symbol, for instance cool-macros::secret-sym. At that point, the issue of accidental lack of hygiene is moot. Thus the Lisp package system provides a viable, complete solution to the macro hygiene problem, which can be regarded as an instance of name clashing.
  • Hygienic transformation. The processor responsible for transforming the patterns of the input form into an output form detects symbol clashes and resolves them by temporarily changing the names of symbols. This kind of processing is supported by Scheme's "let-syntax" and "define-syntax" macro creation systems. The basic strategy is to identify bindings in the macro definition and replace those names with gensyms, and to identify free variables in the macro definition and make sure those names are looked up in the scope of the macro definition instead of the scope where the macro was used.
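
The temporary symbol strategy can be approximated even in C, which has no gensym. The sketch below is an assumption-laden illustration, not a true equivalent: it pastes the standard __LINE__ macro onto a prefix to derive a name for the temporary, so a collision is still possible if the caller uses exactly that generated name. All macro and function names here (PASTE, FRESH, SWAP_INT, swap_survives_tmp) are hypothetical.

```c
/* Two-level pasting so that __LINE__ is expanded before ## is applied. */
#define PASTE_INNER(a, b) a##b
#define PASTE(a, b) PASTE_INNER(a, b)

/* Builds an identifier such as swap_tmp_17 from the current line number. */
#define FRESH(prefix) PASTE(prefix, __LINE__)

/* The generated name is passed in once and reused for every occurrence of t. */
#define SWAP_INT_IMPL(x, y, t) do { int t = (x); (x) = (y); (y) = t; } while (0)
#define SWAP_INT(x, y) SWAP_INT_IMPL(x, y, FRESH(swap_tmp_))

/* Returns 1 if the swap works even though the caller's variable is named tmp,
   a name that a naively written swap macro would be likely to use itself. */
int swap_survives_tmp(void)
{
    int tmp = 1, other = 2;
    SWAP_INT(tmp, other);
    return tmp == 2 && other == 1;
}
```

Unlike gensym, this only makes a clash unlikely rather than impossible, which places it between the obfuscation and temporary symbol strategies above. Some compilers also provide a non-standard __COUNTER__ macro that can be used in place of __LINE__.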

Hygienic macro systems for Scheme

Syntax-rules

Syntax-rules is the standard high-level macro system of R5RS.

(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((temp a))
       (set! a b)
       (set! b temp)))))

Syntax-case

Syntax-case is a low- and high-level macro system that is part of R6RS.

(define-syntax swap!
  (lambda (stx)
    (syntax-case stx ()
      ((_ a b)
       (syntax
        (let ((temp a))
          (set! a b)
          (set! b temp)))))))

Syntactic closures

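Syntactic closures are another type of macro system. With sc-macro-transformer, the expansion is interpreted in the environment of the macro definition, so the macro writer explicitly closes the operands in the environment of the macro use with close-syntax: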

(define-syntax swap!
   (sc-macro-transformer
    (lambda (form environment)
      (let ((a (close-syntax (cadr form) environment))
            (b (close-syntax (caddr form) environment)))
        `(let ((temp ,a))
           (set! ,a ,b)
           (set! ,b temp))))))

Explicit renaming

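Explicit renaming is another type of macro system. The transformer receives the form together with a rename procedure and a compare procedure, and the macro writer calls rename on each identifier that must keep the meaning it has where the macro is defined: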

(define-syntax swap!
 (er-macro-transformer
  (lambda (form rename compare)
    (let ((a (cadr form))
          (b (caddr form))
          (temp (rename 'temp)))
      `(,(rename 'let) ((,temp ,a))
           (,(rename 'set!) ,a ,b)
           (,(rename 'set!) ,b ,temp))))))

References

