31 Jul, 2013

True Story: I Will Always Find You

A customization point is a code construct that the user can leverage to specialize how a particular library action is handled. A common way of implementing this in C++ is to define a template with the default behavior, and let users specialize it for their own types —e.g., std::hash—. This is the story of Spirit X3 and how it lets you specialize customization points without ever leaving your own namespace, sort of...

The Scene

Spirit is a set of C++ libraries for parsing and output generation implemented as Domain Specific Embedded Languages (DSEL) using Expression templates and Template Meta-Programming. The Spirit libraries enable a target grammar to be written exclusively in C++. Inline grammar specifications can mix freely with other C++ code and, thanks to the generative power of C++ templates, are immediately executable.

A key customization point in Spirit X3 is as_parser, specifying which types are parsers or convertible to parsers, and how said conversion is done. Consider this initial pseudocode definition:

namespace boost { namespace spirit { namespace x3 { namespace extension {

  // when T can be converted into a parser
  template <typename T, typename Enable = void>
  struct as_parser
  {
    typedef unspecified type;
    typedef unspecified value_type;
    static type call(T){ /*...*/ }
  };

  // when T cannot be converted into a parser
  template <typename T, typename Enable = void>
  struct as_parser
  {};

} } } }
  • The template argument T is the type we want to convert to a parser.
  • The template argument Enable allows for SFINAE to be used for finer grain specialization.
  • When the template argument T can be converted to a parser:
    • type is the type of the parser T is converted to,
    • value_type is the type used when storing the parser type,
    • call is the function called to do the conversion from T to type.
  • When the template argument T cannot be converted to a parser the struct is empty.

Here is how one would make the library aware of user-defined parsers:

namespace geometry {

  struct point {
    int x, y;          
  };

  /** parse two comma separated integers into the attribute members x and y **/
  struct point_parser {
    typedef point attribute_type;

    template <typename Iterator, typename Context, typename Attribute>
    bool parse(Iterator& first, Iterator const& last, Context const& context, Attribute& attr) const
    {
        Iterator save = first;
        if (!x3::parse(first, last, x3::int_ >> ',' >> x3::int_, std::tie(attr.x, attr.y))) {
          first = save;
          return false;
        }
        return true;
    }
  };

} // close geometry namespace 

namespace boost { namespace spirit { namespace x3 { namespace extension { // open extension namespace

  template <>
  struct as_parser<geometry::point_parser>
  {
    typedef geometry::point_parser const& type; // already a parser
    typedef geometry::point_parser value_type; // store by value
    static type call(geometry::point_parser const& p){ return p; } // pass-through
  };

} } } } // close extension namespace

namespace geometry { // reopen geometry namespace

  /** parse two comma separated integers and compare against the given point members x and y **/
  struct literal_point_parser {
    typedef x3::unused_type attribute_type;

    explicit literal_point_parser(point const& value) : value{value} {}

    template <typename Iterator, typename Context, typename Attribute>
    bool parse(Iterator& first, Iterator const& last, Context const& context, Attribute& /*attr*/) const
    {
        point attr;
        Iterator save = first;
        if (!x3::parse(first, last, x3::int_(attr.x) >> ',' >> x3::int_(attr.y))) {
          first = save;
          return false;
        }
        return true;
    }

  private:
    point value;
  };

} // close geometry namespace 

namespace boost { namespace spirit { namespace x3 { namespace extension { // reopen extension namespace

  template <>
  struct as_parser<geometry::literal_point_parser>
  {
    typedef geometry::literal_point_parser const& type; // already a parser
    typedef geometry::literal_point_parser value_type; // store by value
    static type call(geometry::literal_point_parser const& p){ return p; } // pass-through
  };
  template <>
  struct as_parser<geometry::point>
  {
    typedef geometry::literal_point_parser type; // use literal_point_parser as parser
    typedef geometry::literal_point_parser value_type; // store by value
    static type call(geometry::point const& p){ return type{p}; } // convert to literal_point_parser
  };

} } } } // close extension namespace

namespace geometry { // as we were

  /*...*/

}

Notice how we have to close our namespace and go into the extension one, where we have to use fully qualified names for our types. This is a little inconvenient, but it is nonetheless an inconvenience. The intention is then to specify the semantics for as_parser without ever leaving the user namespace, and have the library somehow find it. By inheriting from x3::parser we identify the parsers as such, and resolve two-thirds of the issue —as there is already a specialization for all things based in x3::parser—. The remaining case is that of a point being used in a parser-expression, where it should be interpreted as using a literal_point_parser initialized with the value of said point.

The Actors

Argument Dependent Lookup

Argument Dependent Lookup, also known as Koenig Lookup, can see through some namespaces. When resolving an unqualified function call, names in a set of associated namespaces are visible as well. Such set of associated namespaces includes the namespace of which the class of each argument is a member of, as well as those of their direct and indirect base classes, and the namespaces associated with the types of the template arguments provided for template type parameters.

[3.4.2/1] When the postfix-expression in a function call (5.2.2) is an unqualified-id, other namespaces not considered during the usual unqualified lookup (3.4.1) may be searched, and in those namespaces, namespace-scope friend function or function template declarations (11.3) not otherwise visible may be found. These modifications to the search depend on the types of the arguments (and for template template arguments, the namespace of the template argument). [ Example:

namespace N {
    struct S { };
    void f(S);
  }
  
  void g() {
    N::S s;
    f(s);   // OK: calls N::f
    (f)(s); // error: N::f not considered; parentheses
            // prevent argument-dependent lookup
  }

—end example ]

[3.4.2/2] For each argument type T in the function call, there is a set of zero or more associated namespaces and a set of zero or more associated classes to be considered. The sets of namespaces and classes is determined entirely by the types of the function arguments (and the namespace of any template template argument). Typedef names and using-declarations used to specify the types do not contribute to this set. [...]

This is the machinery that allows the use of overloaded operators in an unqualified way, provided that they are defined in an associated namespace of at least one of the arguments. Argument dependent lookup does not restrict to operators only, it applies to all unqualified function calls.

Two-Phase Name Lookup

A template definition is comprised by pieces that depend on one or more of the template parameters and will change among instantiations, and pieces that do not depend on any template parameters and will always be the same. Ordinary name lookup applies for non dependent names, and what is known as Late Name Lookup applies for dependent names. Dependent names will be resolved at the point of instantiation —usually the point of first use in a context that requires a complete type—.

[14.6/1] Three kinds of names can be used within a template definition:

  • The name of the template itself, and names declared within the template itself.
  • Names dependent on a template-parameter (14.6.2).
  • Names from scopes which are visible within the template definition.

[14.6.2/1] Inside a template, some constructs have semantics which may differ from one instantiation to another. Such a construct depends on the template parameters. In particular, types and expressions may depend on the type and/or value of template parameters (as determined by the template arguments) and this determines the context for name lookup for certain names. Expressions may be type-dependent (on the type of a template parameter) or value-dependent (on the value of a non-type template parameter). In an expression of the form:

postfix-expression ( expression-listopt )

where the postfix-expression is an unqualified-id, the unqualified-id denotes a dependent name if

  • any of the expressions in the expression-list is a pack expansion (14.5.3),
  • any of the expressions in the expression-list is a type-dependent expression (14.6.2.2), or
  • if the unqualified-id is a template-id in which any of the template arguments depends on a template parameter.

If an operand of an operator is a type-dependent expression, the operator also denotes a dependent name. Such names are unbound and are looked up at the point of the template instantiation (14.6.4.1) in both the context of the template definition and the context of the point of instantiation.

[14.6.3/1] Non-dependent names used in a template definition are found using the usual name lookup and bound at the point they are used. [...]

Within a template definition, the type of arguments in a function call may depend on the template parameters. An unqualified function call may as well depend on the types of its arguments. When both situations happen in the same function call expression, there is no way to resolve it until the point of instantiation when actual argument types are known.

decltype

As of C++11, decltype is an operator for querying the type of an expression.

[7.1.6.2/4] The type denoted by decltype(e) is defined as follows:

  • if e is an unparenthesized id-expression or an unparenthesized class member access (5.2.5), decltype(e) is the type of the entity named by e. If there is no such entity, or if e names a set of overloaded functions, the program is ill-formed;
  • otherwise, if e is an xvalue, decltype(e) is T&&, where T is the type of e;
  • otherwise, if e is an lvalue, decltype(e) is T&, where T is the type of e;
  • otherwise, decltype(e) is the type of e.

The operand of the decltype specifier is an unevaluated operand (Clause 5). [...]

The Solution

As it should be obvious by now, the solution consist of all three actors playing together at once. The interesting part is how they are intertwined.

First, we choose a name to act as our customization point. Since this name will be used with argument dependent lookup, we choose a slightly different name that includes the name of our library. We then define a function with said name in its own implementation detail namespace, with a return type that cannot possibly be a parser. For this particular case, void will do just fine; otherwise, a not_a_parser dummy type can be used.

namespace boost { namespace spirit { namespace x3 { namespace extension { namespace as_parser_detail {

  void as_spirit_parser(...); // C style variadic function, not C++ variadic function template
                              // worst possible candidate in an overloaded set

} } } } }

Then, we define a helper struct that will, given a type T, locate an overload of the customization point if one exists in an associated namespace of T. Such an overload will be dependent on the template parameter T, hence the name lookup will be deferred until the point of instantiation. The return type of such overload is obtained by using the decltype operator.

namespace boost { namespace spirit { namespace x3 { namespace extension { namespace as_parser_detail {

  template <typename T, typename R = decltype(as_spirit_parser(std::declval<T const&>()))>
  struct deduce_as_parser {
    typedef R type; // the parser type is the return type of the customization point
    typedef typename std::decay<R>::type value_type; // store by decaying the parser type

    static type call(T const& v) {
      return as_spirit_parser(v); // forward to the customization point
    }
  };

If no overload of the customization point is found, the previously defined worst match with a return type of void will be found. In that case, T is not a parser so the struct will be empty.

  template <typename T>
  struct deduce_as_parser<T, void>
  {};

} } } } }

Finally, we make the original as_parser customization point inherit from deduce_as_parser if not specialized. This way, the user can choose whether to use the original or the new customization point. If both are used for the same type, then the original customization point takes precedence.

namespace boost { namespace spirit { namespace x3 { namespace extension {

  template <typename T, typename Enable = void>
  struct as_parser : as_parser_detail::deduce_as_parser<T> {};

} } } }

With all this in place, our example now looks like this:

namespace geometry {

  struct point {
    int x, y;          
  };

  struct point_parser : x3::parser<point_parser> { // already a parser
    /*as before*/
  };
  struct literal_point_parser : x3::parser<literal_point_parser> { // already a parser 
    /*as before*/
  }
  literal_point_parser as_spirit_parser(point const& p){ // use literal_point_parser as parser
    return literal_point_parser{p};
  }

  // as we were
  /*...*/

}