02 Dec, 2013

Episode Eight: The Curious Case of the Recurring Template Pattern

Inheritance is a handy tool to simplify design and avoid code duplication. However, good old fashion inheritance is not always the right tool for the job. What if a base class could know, at compile time, who derives from it? Enter the curiously recurring template pattern (CRTP)...

The Pattern

The curiously recurring template pattern (CRTP) is an idiom in which a class X derives from a class template instantiation using X itself as template argument. The name of this idiom was coined by Jim Coplien, who had observed it in some of the earliest C++ template code. It splits generic and concrete functionality by delegating to its derived class; but unlike regular inheritance and virtual functions, the base class knows at compile time its derived type —this is often referred to as static polymorphism—.

In its general form:

template <typename Derived>
struct base {
  void interface() {
    static_cast<Derived*>(this)->implementation();
  }
};

struct derived : base<derived> {
  void implementation() { /*...*/ }
};

derived d = /*...*/;
d.interface();

This may seem odd at first, using types and functions before they are defined, but alas it only looks that way. The base class is a template, so it will not be instantiated until actually required. That is at the point base<derived> is used as a base class.

[14.7.1/1] Unless a class template specialization has been explicitly instantiated (14.7.2) or explicitly specialized (14.7.3), the class template specialization is implicitly instantiated when the specialization is referenced in a context that requires a completely-defined object type or when the completeness of the class type affects the semantics of the program. The implicit instantiation of a class template specialization causes the implicit instantiation of the declarations, but not of the definitions, default arguments, or exception-specifications of the class member functions, member classes, scoped member enumerations, static data members and member templates; and it causes the implicit instantiation of the definitions of unscoped member enumerations and member anonymous unions.

At that time derived has been declared, and it is an incomplete type until it's fully defined —at the semicolon ending its body—.

[9.1/2] A class declaration introduces the class name into the scope where it is declared and hides any class, variable, function, or other declaration of that name in an enclosing scope (3.3). [...]

[9.1./4] [ Note: The declaration of a class name takes effect immediately after the identifier is seen in the class definition or elaborated-type-specifier [...] —end note ].

This will in turn cause the declaration of interface to be instantiated, but not its definition —which is the one making use of Derived—. That will happen when and if interface is used in a context that requires it to be defined.

[14.7.1/10] An implementation shall not implicitly instantiate a function template, a variable template, a member template, a non-virtual member function, a member class, or a static data member of a class template that does not require instantiation. [...]

By the time interface is used, both base<derived> and derived would have been defined already. If the compilation process could be traced, this would be the output:

  1. derived is declared,
  2. base<derived> is instantiated,
  3. base<derived>::interface declaration is instantiated,
  4. derived is defined,
  5. derived::implementation is defined,
  6. base<derived>::interface definition is instantiated —only if needed— .

Structure Inheritance

From Object Oriented Software Construction —the book on Eiffel—:

Structure inheritance applies if A, a deferred class, represents a general structural property and B, which may be deferred or effective, represents a certain type of objects possessing that property.

Usually the structural property represented by A is a mathematical property that a certain set of objects may possess; for example A may be the class COMPARABLE, equipped with such operations as infix "<" and infix ">=", representing objects to which a total order relation is applicable. A class that needs an order relation of its own, such as STRING, will inherit from COMPARABLE.

In its simplest form, structure inheritance carries information about a specific property of a type. By taking advantage of CRTP, it is possible to retain at compile time the actual type being used:

template <typename Derived>
struct some_concept { /*...*/ };

struct foo : some_concept<foo> { /*...*/ };

template <typename Derived>
void do_something(some_concept<Derived>& p) {
  Derived& d = static_cast<Derived&>(p); // get the actual object back

  /*...do something with d, knowing that it models some_concept...*/
}

Consider the proposed example of types with an order relation, which makes them comparable. Knowing that all relational operators can be implemented in terms of less than, this looks like a perfect opportunity to use inheritance to provide all four operators for the price of one. A naïve attempt at that would look like this:

class comparable {
private:
  virtual bool less_than(comparable const& other) const = 0;

public:
  friend bool operator <(comparable const& lhs, comparable const& rhs) {
    return lhs.less_than(rhs);
  }
  friend bool operator >(comparable const& lhs, comparable const& rhs) {
    return rhs.less_than(lhs);
  }
  friend bool operator <=(comparable const& lhs, comparable const& rhs) {
    return !rhs.less_than(lhs);
  }
  friend bool operator >=(comparable const& lhs, comparable const& rhs) {
    return !lhs.less_than(rhs);
  }
};

comparable const& max(comparable const& lhs, comparable const& rhs) {
  return lhs.less_than(rhs) ? rhs : lhs;
}

class foo : public comparable {
private:
  bool less_than(comparable const& other_comparable) const override {
    foo const& other = dynamic_cast<foo const&>(other_comparable); // throws if not compatible
    /*...*/
  }
};

This is all kinds of wrong, but the worst of them all is the introduced relationship between all comparable types. Since only the comparable base is considered, it is possible to compare apples and oranges only to find out at runtime that this is a logic error —a std::complex is comparable and so is a std::string, but one is not necessarily comparable against the other—. All that can be done to handle a mixed type situation is to check the dynamic type of the arguments, and throw a std::bad_cast exception when they are not compatible.

A type safe solution can be implemented using templates. Comparable is not a class after all, but a family of classes:

template <typename Derived>
class comparable {
public:
    friend bool operator <(comparable<Derived> const& clhs, comparable<Derived> const& crhs) {
        Derived const& lhs = static_cast<Derived const&>(clhs);
        Derived const& rhs = static_cast<Derived const&>(crhs);

        return lhs.less_than(rhs);
    }
    friend bool operator >(comparable<Derived> const& clhs, comparable<Derived> const& crhs) {
        Derived const& lhs = static_cast<Derived const&>(clhs);
        Derived const& rhs = static_cast<Derived const&>(crhs);

        return rhs.less_than(lhs);
    }
    friend bool operator <=(comparable<Derived> const& clhs, comparable<Derived> const& crhs) {
        Derived const& lhs = static_cast<Derived const&>(clhs);
        Derived const& rhs = static_cast<Derived const&>(crhs);

        return !rhs.less_than(lhs);
    }
    friend bool operator >=(comparable<Derived> const& clhs, comparable<Derived> const& crhs) {
        Derived const& lhs = static_cast<Derived const&>(clhs);
        Derived const& rhs = static_cast<Derived const&>(crhs);

        return !lhs.less_than(rhs);
    }
};

template <typename Derived>
Derived const& max(comparable<Derived> const& lhs, comparable<Derived> const& rhs) {
    comparable<Derived> const& result = lhs.less_than(rhs) ? rhs : lhs;

    return static_cast<Derived const&>(result);
}

class foo : public comparable<foo> {
public:
    bool less_than(foo const& other) const { /*...*/;  }
};

It should be noticed how less_than had to be made public in order for comparable<foo> to be able to access it. The base class does not have special access to its derived class —it is not even supposed to know about it—. To keep the implementation private, comparable<foo> would have to be made a friend or granted access in some other way.

Additionally, max will now preserve the type of the arguments in the return value, instead of just dropping the static type information.

With this approach, any attempt to compare objects of different types will result in a compilation error —comparable<std::complex> is a totally different type than comparable<std::string>—. As a bonus, this alternative makes no use of virtual functions nor dynamic_casts, resulting a bit more efficient both in memory usage and execution time. In fact, given that comparable<T> is a class with no members, it is a perfect candidate for the Empty Base Optimization.

Facility inheritance

Once again, from Object Oriented Software Construction:

Facility inheritance applies if A exists solely for the purpose of providing a set of logically related features for the benefit of heirs such as B.

While still on the comparable example, a cleaner solution can be achieved by using comparable<> merely to provide additional functionality. Since comparable<> knows about the derived class, it can implement said functionality in terms of the interface of the derived class itself:

template <typename Derived>
class comparable {
public:
    friend bool operator >(Derived const& lhs, Derived const& rhs) {
        return rhs < lhs;
    }
    friend bool operator <=(Derived const& lhs, Derived const& rhs) {
        return !(rhs < lhs);
    }
    friend bool operator >=(Derived const& lhs, Derived const& rhs) {
        return !(lhs < rhs);
    }
};

class foo : private comparable<foo> {
public:
    friend bool operator <(foo const& lhs, foo const& rhs) {
        /*...*/;
    }
};

using std::max;

The inheritance is now private, since it is only an implementation detail as opposed to part of the interface. If the base class provides member functions, those can be reintroduced as public via using declarations.

Finally, there is little point in reimplementing max anymore, the one provided by the standard library will do just fine.

These techniques do not limit to operators, but they do happen to fit the model nicely —before you go reinventing the wheel, make sure to check Boost.Operators—.

Summary

The curiously recurring template pattern is an idiom in which the base class knows the static type of its derived class, given as a template parameter, allowing the base class to defer the implementation to the derived class.

  • A class declaration introduces a name for an incomplete type.
  • A class definition completes the named type.
  • Non-virtual functions of a template class are instantiated when used.
  • Structure and facility inheritance can be modeled with CRTP without loosing static type information.

References: