Template metaprogramming

Template metaprogramming is a metaprogramming technique in which templates are used by a compiler to generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of these templates includes compile-time constants, data structures, and complete functions. The use of templates can be thought of as compile-time execution. The technique is used by a number of languages, the best known being C++, but also Curl, D, Eiffel, Haskell, ML and XL.

Components of template metaprogramming

The use of templates as a metaprogramming technique requires two distinct operations: a template must be defined, and a defined template must be instantiated. The template definition describes the generic form of the generated source code, and the instantiation causes a specific set of source code to be generated from the generic form in the template.
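The two operations can be sketched in C++ as follows. This is only an illustration; the names `Squared`, `small_tile` and `large_tile` are hypothetical and do not come from any library.

```cpp
// Definition: the generic form. On its own it generates no code.
template <int N>
struct Squared
{
    enum { value = N * N };
};

// Instantiation: naming Squared<3> and Squared<12> makes the compiler
// generate two distinct classes from the single definition above.
int small_tile() { return Squared<3>::value; }
int large_tile() { return Squared<12>::value; }
```

Each distinct template argument yields a separate instantiation, and an argument that is never named yields no generated code at all.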

Template metaprogramming is generally Turing-complete, meaning that any computation expressible by a computer program can be computed, in some form, by a template metaprogram.
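As an informal illustration (not a proof of Turing-completeness), the sketch below shows the two ingredients such arguments usually rely on in C++: recursion through nested instantiation and branching through specialization. The name `Gcd` is hypothetical.

```cpp
// Euclid's algorithm: gcd(A, B) = gcd(B, A % B), ending when B == 0.
template <unsigned A, unsigned B>
struct Gcd
{
    enum { value = Gcd<B, A % B>::value };
};

// The partial specialization plays the role of the conditional branch
// that terminates the recursion.
template <unsigned A>
struct Gcd<A, 0>
{
    enum { value = A };
};
```

For instance, `Gcd<48, 18>::value` is reduced by the compiler through `Gcd<18, 12>`, `Gcd<12, 6>` and `Gcd<6, 0>` to the constant 6.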

Templates are different from macros. A macro, which is also a compile-time language feature, generates code in-line using text manipulation and substitution. Macro systems often have limited compile-time process flow abilities and usually lack awareness of the semantics and type system of their companion language (Lisp's macros are an exception: they are written in Lisp itself and are not a simple text manipulation and substitution).
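The difference can be seen in a small C++ sketch. The names `SQUARE_MACRO`, `square` and `next_value` are hypothetical and exist only for this illustration.

```cpp
// Macro: pure text substitution. The argument text is pasted twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Template: a type-checked function whose argument is evaluated once.
template <typename T>
T square(T x)
{
    return x * x;
}

// Helper that records how many times it has been called.
int next_value(int& calls)
{
    ++calls;
    return 3;
}
```

`SQUARE_MACRO(next_value(calls))` expands to two calls of `next_value`, because the preprocessor knows nothing about evaluation, whereas `square(next_value(calls))` calls it exactly once.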

Template metaprograms have no mutable variables: no variable can change value once it has been initialized. Template metaprogramming can therefore be seen as a form of functional programming. In fact, many template implementations implement flow control only through recursion, as seen in the example below.

Using template metaprogramming

Though the syntax of template metaprogramming is usually very different from the programming language it is used with, it has practical uses. Some common reasons to use templates are to implement generic programming (avoiding sections of code which are similar except for some minor variations) or to perform automatic compile-time optimization, such as doing something once at compile time rather than every time the program is run, for instance by having the compiler unroll loops to eliminate jumps and loop-count decrements whenever the program is executed.
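The generic-programming use can be sketched as a single function template that replaces a family of near-identical functions. The name `largest` is hypothetical, and the only assumption made about the element type is that it supports `operator<` and copying.

```cpp
#include <cstddef>

// One definition serves arrays of any comparable element type and any
// length; both T and N are deduced from the argument.
template <typename T, std::size_t N>
T largest(const T (&items)[N])
{
    T best = items[0];
    for (std::size_t i = 1; i < N; ++i)
        if (best < items[i])
            best = items[i];
    return best;
}
```

Calling `largest` on an `int[3]` and on a `double[2]` produces two separate instantiations, without the programmer writing either one by hand.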

Compile-time class generation

What exactly "programming at compile-time" means can be illustrated with an example of a factorial function, which in non-templated C++ can be written using recursion as follows:

int factorial(int n) 
{
    if (n == 0)
        return 1;
    return n * factorial(n - 1);
}

void foo()
{
    int x = factorial(4); // == (4 * 3 * 2 * 1 * 1) == 24
    int y = factorial(0); // == 0! == 1
}

The code above will execute when the program is run to determine the factorial value of the literals 4 and 0.

By using template metaprogramming instead, with template specialization providing the ending condition for the recursion, the factorials used in the program (and only those) can be calculated at compile time:

template <int N>
struct Factorial 
{
    enum { value = N * Factorial<N - 1>::value };
};

template <>
struct Factorial<0> 
{
    enum { value = 1 };
};

// Factorial<4>::value == 24
// Factorial<0>::value == 1
void foo()
{
    int x = Factorial<4>::value; // == 24
    int y = Factorial<0>::value; // == 1
}

The code above calculates the factorial values of the literals 4 and 0 at compile time and uses the results as if they were precalculated constants.

While the two versions are similar from the point of view of the program's functionality, the first example calculates the factorials at run time, while the second calculates them at compile time. However, to use templates in this manner, the compiler must know the values of the template parameters at compile time, so Factorial<X>::value can only be used if X is known at compile time. In other words, X must be a constant literal or a constant expression, such as one formed with the sizeof operator.
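The constant-expression requirement can be checked with a short sketch. It repeats the Factorial template so that it stands alone, and it assumes a C++11 or later compiler for static_assert; the name `kDepth` is hypothetical.

```cpp
template <int N>
struct Factorial
{
    enum { value = N * Factorial<N - 1>::value };
};

template <>
struct Factorial<0>
{
    enum { value = 1 };
};

// A literal is a constant expression.
static_assert(Factorial<5>::value == 120, "5! is 120");

// So is an expression built from sizeof and literals (sizeof(char) is 1).
static_assert(Factorial<sizeof(char) + 2>::value == 6, "3! is 6");

// So is a const integer initialized from a constant expression.
const int kDepth = 4;
static_assert(Factorial<kDepth>::value == 24, "4! is 24");

// By contrast, given an int n whose value is only known at run time,
// Factorial<n>::value does not compile.
```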

Compile-time code optimization

The factorial example above is one example of compile-time code optimization, in that all factorials used by the program are precomputed and injected as numeric constants at compilation, saving both run-time overhead and memory footprint. It is, however, a relatively minor optimisation.

As another, more significant, example of compile-time loop unrolling, template metaprogramming can be used to create n-dimensional vector classes (where n is known at compile time). The benefit over a more traditional n-dimensional vector is that the loops can be unrolled, resulting in highly optimized code. As an example, consider the addition operator. An n-dimensional vector addition might be written as

template<int dimension>
Vector<dimension>& Vector<dimension>::operator+=(const Vector<dimension>& rhs) 
{
    for (int i = 0; i < dimension; ++i)
        value[i] += rhs.value[i];
    return *this;
}

When the compiler instantiates the function template defined above for a two-dimensional vector, code equivalent to the following may be produced:

template<>
Vector<2>& Vector<2>::operator+=(const Vector<2>& rhs) 
{
    value[0] += rhs.value[0];
    value[1] += rhs.value[1];
    return *this;
}

The compiler's optimizer should be able to unroll the for loop because the template parameter dimension is a constant at compile time.
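The example above leaves the unrolling to the optimizer. The unrolling can also be expressed in the templates themselves, as in the following sketch. The helper `AddElements` is hypothetical, and the element type double and the public `value` array are assumptions, since the definition of Vector is not shown above.

```cpp
// Adds the first I elements of rhs into lhs by recursive instantiation
// instead of a run-time loop.
template <int I>
struct AddElements
{
    static void apply(double* lhs, const double* rhs)
    {
        AddElements<I - 1>::apply(lhs, rhs);  // elements 0 .. I-2
        lhs[I - 1] += rhs[I - 1];             // element I-1
    }
};

// Base case: no elements left to add.
template <>
struct AddElements<0>
{
    static void apply(double*, const double*) {}
};

template <int dimension>
struct Vector
{
    double value[dimension];

    Vector& operator+=(const Vector& rhs)
    {
        AddElements<dimension>::apply(value, rhs.value);
        return *this;
    }
};
```

For Vector<2> the instantiation chain contains one addition statement per element and no loop counter. Whether the nested calls are actually inlined still depends on the compiler.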

Static polymorphism

Polymorphism is a common standard programming facility where objects of a derived class can be used as instances of their base class but where the derived class's methods will be invoked, as in this code

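#include <iostream>

class Base
{
    public:
    virtual ~Base() {} // virtual destructor: makes deleting through a Base* well defined
    virtual void method() { std::cout << "Base"; }
};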

class Derived : public Base
{
    public:
    virtual void method() { std::cout << "Derived"; }
};

int main()
{
    Base *pBase = new Derived;
    pBase->method(); //outputs "Derived"
    delete pBase;
    return 0;
}

where all invocations of virtual methods will be those of the most-derived class. This dynamically polymorphic behaviour is typically obtained by the creation of virtual look-up tables for classes with virtual methods, tables that are consulted at run time to identify the method to be invoked. Thus, run-time polymorphism generally entails some execution overhead.

However, in many cases the polymorphic behaviour needed is invariant and can be determined at compile time. Then the Curiously Recurring Template Pattern (CRTP) can be used to achieve static polymorphism, which is an imitation of polymorphism in programming code but which is resolved at compile time and thus does away with run-time virtual-table lookups. For example:

template <class Derived>
struct base
{
    void interface()
    {
         // ...
         static_cast<Derived*>(this)->implementation();
         // ...
    }
};

struct derived : base<derived>
{
     void implementation();
};

Here the base class template takes advantage of the fact that member function bodies (definitions) are not instantiated until they are used, which happens after the derived class has been fully declared. It can therefore call members of the derived class from within its own member functions through a static_cast, so that the compiler generates an object composition with polymorphic characteristics. As an example of real-world usage, the CRTP is used in the Boost iterator library [1].
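A fuller sketch of the pattern, with two derived classes and a generic function that uses them, might look as follows. All names (`Shape`, `Circle`, `Rectangle`, `describe_twice`) are hypothetical and are not taken from Boost.

```cpp
#include <string>

template <class Derived>
struct Shape
{
    // Non-virtual interface: the call to name() is resolved at compile time.
    std::string describe() const
    {
        return "shape: " + static_cast<const Derived*>(this)->name();
    }
};

struct Circle : Shape<Circle>
{
    std::string name() const { return "circle"; }
};

struct Rectangle : Shape<Rectangle>
{
    std::string name() const { return "rectangle"; }
};

// Generic code accepts any Shape<T> without virtual dispatch.
template <class T>
std::string describe_twice(const Shape<T>& shape)
{
    return shape.describe() + ", " + shape.describe();
}
```

Neither class has a virtual table. The trade-off is that `Shape<Circle>` and `Shape<Rectangle>` are unrelated types, so they cannot be stored together behind a common base pointer.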

Another similar use is the "Barton-Nackman trick", sometimes referred to as "restricted template expansion", where common functionality can be placed in a base class that is used not as a contract but as a necessary component to enforce conformant behaviour while minimising code redundancy.
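A minimal sketch of this idiom, under the assumption that the derived class supplies a member function called `equal_to` (a hypothetical name, as are `EqualityComparable` and `Version`):

```cpp
// Base class template that injects comparison operators for T.
// The friend functions are defined inside the class and are found
// through argument-dependent lookup when T objects are compared.
template <class T>
struct EqualityComparable
{
    friend bool operator==(const T& a, const T& b) { return a.equal_to(b); }
    friend bool operator!=(const T& a, const T& b) { return !a.equal_to(b); }
};

class Version : private EqualityComparable<Version>
{
public:
    Version(int major_number, int minor_number)
        : major_(major_number), minor_(minor_number) {}

    bool equal_to(const Version& other) const
    {
        return major_ == other.major_ && minor_ == other.minor_;
    }

private:
    int major_;
    int minor_;
};
```

The base class is never used as an interface type. It exists only to supply both operators, kept consistent with each other, to every class that derives from it.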

Benefits and drawbacks of template metaprogramming

Compile-time versus execution-time tradeoff: Since all templated code is processed, evaluated and expanded at compile-time, compilation will take longer while the executable code may be more efficient. This overhead is generally small, but for large projects, or projects relying pervasively on templates, it may be significant.
Generic programming: Template metaprogramming allows the programmer to focus on architecture and delegate to the compiler the generation of any implementation required by client code. Thus, template metaprogramming can support highly generic code, which may reduce code size and ease maintenance.
Readability: With respect to C++, the syntax and idioms of template metaprogramming are esoteric compared to conventional C++ programming, and advanced, or even most non-trivial, template metaprogramming can be very difficult to understand. Metaprograms can thus be difficult to maintain by programmers inexperienced in template metaprogramming (though this may vary with the language's implementation of template metaprogramming syntax).
Portability: With respect to C++, due to differences in compilers, code relying heavily on template metaprogramming (especially the newest forms of metaprogramming) might have portability issues.

References

Krzysztof Czarnecki, Ulrich W. Eisenecker: Generative Programming: Methods, Tools, and Applications, Addison-Wesley, ISBN 0-201-30977-7
Andrei Alexandrescu: Modern C++ Design: Generic Programming and Design Patterns Applied, Addison-Wesley, ISBN 3-8266-1347-3
David Abrahams, Aleksey Gurtovoy: C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond, Addison-Wesley, ISBN 0-321-22725-5
David Vandevoorde, Nicolai M. Josuttis: C++ Templates: The Complete Guide, Addison-Wesley, ISBN 0-201-73484-2
Manuel Clavel: Reflection in Rewriting Logic: Metalogical Foundations and Metaprogramming Applications, ISBN 1-57586-238-7
What's Wrong with C++ Templates? by Jacob Matthews

External links

The Boost Metaprogramming Library (Boost MPL)
The Spirit Library (built using template-metaprogramming)
The Boost Lambda library (use STL algorithms easily)
Todd Veldhuizen, "Using C++ template metaprograms," C++ Report, Vol. 7 No. 4 (May 1995), pp. 36-43
Template Haskell, type-safe metaprogramming in Haskell
Walter Bright, "Templates Revisited", an article on template metaprogramming in the D programming language.