30 Jun, 2014

Episode Nine: Erasing the Concrete

Templates are an extremely powerful C++ feature that enables writing generic implementations that are also highly efficient. By instantiating the implementation for each combination of template arguments used it is possible to generate code equivalent or close to that of a hand-written implementation. However, this type-awareness that opens the door to efficiency also closes the door to crossing virtual or binary interfaces, as well as preventing other everyday activities like storing things in a container. Thus, it is often necessary to collapse all this generality into a single type, effectively erasing the concrete type of the objects involved.

Any technique in which a set of types is represented via a single type is effectively performing type erasure. There are several type erasure techniques, and they are not exclusive to C++.

Unions

A union is the simplest form of type erasure, given that it can hold any one of its non-static data members at a given time. A discriminated or tagged union —such as Boost.Variant— augments the basic union by also packing some form of identifier of the member currently in use. This allows to probe the kind of contained member, and switch on it to perform different actions as appropriate.

This kind of type erasure is bounded, as it requires all members of the set of types to be known at the point of declaration of the union. While this is an ideal —and efficient— solution in certain contexts, it is of little help in generic context where the participating types will not be known in advance. The remaining presented techniques are unbounded, meaning they allow the set of types to be extended at different and independent points in the code.

void Pointers

The low level form of type erasure is that of a void pointer. Such a pointer can point to objects of any type, and has no knowledge of which type that is. The concrete type of the pointee has to be kept externally, explicitly or implicitly, since doing anything useful with it requires casting the pointer appropriately. The pointer deals exclusively with storage; when functionality is required, it is often associated with function pointers taking void*s as arguments, and performing actions on them after casting them to pointers to their concrete types —e.g., std::qsort—. Since this technique basically circumvents the type system, its biggest downside is that it is not type safe; dereferencing the pointer after casting it to the wrong type will result in undefined behavior.

By providing a void* member or argument —along with sufficient event hooks or callbacks—, an API can give the user a way to propagate additional information without having to be aware of its type. This technique is widely used in C, were higher level alternatives are not available, but it has found its way into C++ at IOStreams.

ios_base Storage Functions and Callbacks

IOStreams provides a set of well known manipulators to operate on streams, and it also provides the —not so well known—infrastructure for users to extend them. Manipulators often need to store state of their own into the target stream, and that is accomplished by what is essentially a std::vector<void*> —as well as a std::vector<long>—within each stream. A manipulator requires a slot from that storage by invoking std::ios_base::xalloc.

[27.5.3.5] ios_base storage functions

static int xalloc();
  • Returns: index++.

The returned index is unique program-wide, though it might be different on successive runs. Such index can be used on any stream as an argument to std::ios_base::pword to obtain a reference to a void* —and similarly std::ios_base::iword to obtain a reference to a long—, which can be used in any way the manipulator sees fit, without the library having to know anything about it.

void*& pword(int idx);
  • Effects: If parray is a null pointer, allocates an array of pointers to void of unspecified size and stores a pointer to its first element in parray. The function then extends the array pointed at by parray as necessary to include the element parray[idx]. [...]

  • Returns: On success parray[idx]. On failure a valid void*& initialized to 0.

The state stored by manipulators may need to be copied and eventually destroyed during the lifetime of the stream. The library cannot perform those actions by itself, since it does not know the type of the state. The user has to fill the holes via std::ios_base::register_callback.

[27.5.3.6] ios_base callbacks

void register_callback(event_callback fn, int index);
  • Effects: Registers the pair (fn,index) such that during calls to imbue() (27.5.3.3), copyfmt(), or ˜ios_base() (27.5.3.7), the function fn is called with argument index. Functions registered are called when an event occurs, in opposite order of registration. Functions registered while a callback function is active are not called until the next event.

  • Requires: The function fn shall not throw exceptions.

Putting all the pieces together, what follows is a user-defined manipulator that stores additional data within a stream, without the stream ever knowing about the concrete types involved:

struct point_manipulator {
  enum delimiter { delimit, open, close, _count };

public:
  explicit point_manipulator(
      std::string const& delimit = " "
    , std::string const& open = "("
    , std::string const& close = ")")
   : _delimiters{delimit, open, close}
  {}

  friend std::ostream& operator<<(std::ostream& stream, delimiter d) {
    point_manipulator* manip = static_cast<point_manipulator*>(stream.pword(index()));
    if (manip != nullptr) {
      stream << manip->_delimiters[d]; // use the embedded manipulator
    } else {
      stream << point_manipulator{}._delimiters[d]; // use the default values
    }
    return stream;
  }

  friend std::ostream& operator<<(std::ostream& stream, point_manipulator const& m) {
    point_manipulator* manip = static_cast<point_manipulator*>(stream.pword(index()));
    if (manip != nullptr) {
      *manip = m; // replace the embedded manipulator
    } else {
      stream.pword(index()) = new point_manipulator{m}; // embed a new manipulator
      stream.register_callback(&point_manipulator::callback, 0);
    }
    return stream;
  }

private:
  static int index() {
    static int const index = std::ios_base::xalloc();
    return index;
  }

  static void callback(std::ios_base::event evt, std::ios_base& ios, int) noexcept {
      point_manipulator* manip = static_cast<point_manipulator*>(ios.pword(index()));
      switch (evt) {
      case std::ios_base::erase_event: {
        delete manip; // delete the embedded manipulator
        break;
      }
      case std::ios_base::copyfmt_event: {
        ios.pword(index()) = new point_manipulator{*manip}; // clone the manipulator
        break;
      }
      }
  }

private:
  std::string _delimiters[_count];
};

Even when ignoring several orthogonal concepts —errors, locales, different character types—, the implementation of a simple manipulator is considerably complex. Using it, however, is fairly trivial:

struct point { float x, y; };

std::ostream& operator<<(std::ostream& stream, point const& p) {
  return stream
    << point_manipulator::open << p.x
    << point_manipulator::delimit << p.y
    << point_manipulator::close;
}

std::cout << point{1, 2}; // outputs "(1 2)"
std::cout << point_manipulator{", ", "<", ">"} << point{1, 2}; // outputs "<1, 2>"

It should be noted that had the manipulator decided to represent delimiters as single characters instead of std::strings, it could have made use of the long attribute instead —or several of them—. That way, memory allocations could be avoided, and a callback would not be needed since character types are trivial types.

Virtual functions

Inheritance and virtual functions allow for a high level form of type erasure, one that is type safe. The goal of dynamic polymorphism is to provide a single interface to entities of different types, thus effectively erasing those types. Under the hood, there are still void pointers and a bunch of —per-type— function pointers, but those lower level details are filled in by the compiler. Additionally, the compiler also generates an instance of std::type_info so that the dynamic type of the pointee can be retrieved.

When given a definition of a class with virtual functions, the compiler synthesizes a virtual table with all the pertaining information to support dynamic dispatch.

class closure {
public:
  virtual closure* clone() const = 0;
  virtual void invoke() = 0;
  virtual ~closure() {}
};

When a virtual function —like invoke— is called using a pointer or reference to closure, the actual function implementation will be looked up at the virtual table based on the dynamic type of the object. It is then possible to write code in terms of just the interface; such code operates on types of which it knows nothing but that they do conform to the interface:

// invoke a closure n times
void invoke_repeat(closure& c, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    c.invoke();
}

// concrete closure
class greeter : public closure {
public:
  explicit greeter(std::string const& name)
    : _name{name}
  {}

  virtual greeter* clone() const override { // covariant return types are awesome
    return new greeter{*this};
  }

  void invoke() override {
    std::cout << "Hello, " << _name << '!';
  }

private:
  std::string _name;
};

greeter g{"World"};
invoke_repeat(g, 3); // outputs "Hello, World!Hello, World!Hello, World!"

This is already a great improvement in simplicity and readability, but storing instances of these type-erased objects still requires dealing with pointers, which is not entirely straightforward. A std::unique_ptr will simplify handling the polymorphic object lifetime; and, given that the interface provides a clone function, some sort of clone_ptr —like the one introduced here as an example— would provide natural semantics for copies too.

Another shortcoming of this approach is that a type has to explicitly and intrusively declare itself to conform to a certain interface. The explicitness is useful when implementing models of a concept with semantic behavior, but is redundant when the concept is strictly syntactical —e.g., whether the expression v() is well-formed for an object v of type T—. The intrusiveness, however, makes it impractical in all situations. After all, without a standardized interface, it is impossible for two libraries that don't know about each other to provide compatible implementations. In the same way, it is impossible to declare conformance to the closure interface for function pointers, since they aren't even classes; or lambdas, since their definition is generated by the compiler —and their types unspecified—. In those situations, the only viable option is to handcraft wrappers to adapt a type to the expected interface.

Run-Time Type Information

The dynamic type of a polymorphic object can be queried by means of the typeid operator:

[5.2.8/1] The result of a typeid expression is an lvalue of static type const std::type_info (18.7.1) [...]. The lifetime of the object referred to by the lvalue extends to the end of the program. [...}

[5.2.8/2] When typeid is applied to a glvalue expression whose type is a polymorphic class type (10.3), the result refers to a std::type_info object representing the type of the most derived object (1.8) (that is, the dynamic type) to which the glvalue refers. [...]

The resulting object holds implementation-specific information about the dynamic type, including the name of the type and means to compare two types for equality or collating order. It should be noted that the name of the type is an implementation defined string, which provides no further guarantees. The returned string could be identical for several types, or even empty:

[18.7.1] Class type_info

const char* name() const noexcept;
  • Returns: An implementation-defined NTBS [null-terminated byte string].

A dynamic_cast is a type of conversion for pointers and references that can make use of the run-time type information. For the simplest cases, where the type system provides enough information at compile time, it behaves just like a static_cast; when the conversion is down or across the inheritance tree on an object of polymorphic type, however, a run-time check is performed to determine whether the operation can be performed:

[5.2.7/1] The result of the expression dynamic_cast<T>(v) is the result of converting the expression v to type T. T shall be a pointer or reference to a complete class type, or "pointer to cv void".

[5.2.7/7] If T is "pointer to cv void", then the result is a pointer to the most derived object pointed to by v. Otherwise, a run-time check is applied to see if the object pointed or referred to by v can be converted to the type pointed or referred to by T.

[5.2.7/8] If C is the class type to which T points or refers, the run-time check logically executes as follows:

  • If, in the most derived object pointed (referred) to by v, v points (refers) to a public base class subobject of a C object, and if only one object of type C is derived from the subobject pointed (referred) to by v the result points (refers) to that C object.

  • Otherwise, if v points (refers) to a public base class subobject of the most derived object, and the type of the most derived object has a base class, of type C, that is unambiguous and public, the result points (refers) to the C subobject of the most derived object.

  • Otherwise, the run-time check fails.

[5.2.7/9] The value of a failed cast to pointer type is the null pointer value of the required result type. A failed cast to reference type throws an exception (15.1) of a type that would match a handler (15.3) of type std::bad_cast.

The use of dynamic_cast is often considered a sign of bad interface design [citation needed]. A seemingly genuine use case is that of probing a polymorphic object for support of a more refined interface providing specialized functionality —as opposed to querying for concrete types—.

Classes All the Way Down

In —so called— pure object oriented languages such as Java, every object derives from a universal base class generally named Object. This base class is a slightly better void*, since it knows the dynamic type of the pointee and can trigger a runtime error if casted incorrectly. In these languages, the term type erasure refers to a technique that avoids the explosion of generated code from different instantiations of the same generic classes. Given a generic class, each occurrence of an unbound type parameter is replaced by Object, and type casts are inserted as necessary to preserve types. This decouples the generated code from the parameters, so that all instantiations of a generic class share the same single generated code.

This optimization is also useful in C++, provided one deals exclusively with pointers —so not that useful—. This is leveraged by Boost.PointerContainer, where a boost::ptr_xxx<T> container is backed by a std::xxx<void*> container instead of a std::xxx<T*> one, in order to reduce the number of template instantiations.

Value Semantics

In the C++ lands, value semantics rule and inheritance is an implementation detail. In these lands, the term type erasure is strongly associated with this last technique, that builds on top of the others to add value semantics. What follows is a definition of closure using this technique —in what is basically a simplified and watered down version of std::function<void()>—.

Dynamic polymorphism is still used. The interface is exactly the same as before, except that it is private to the wrapper:

class closure {
private:
  struct interface {
    virtual interface* clone() const = 0;
    virtual void invoke() = 0;
    virtual ~interface() {}
  };

  /*...*/
};

The concrete implementation, also private to the wrapper, is templated on the target type. It will be instantiated as needed for each type used to initialize the wrapper:

class closure {
  /*...*/

  template <typename Target>
  struct implementation final : interface {
    explicit implementation(Target const& target)
      : _target(target)
    {}

    explicit implementation(Target&& target)
      : _target(std::move(target))
    {}

    implementation* clone() const override {
      return new implementation{*this};
    }

    void invoke() override {
      _target();
    }

    Target _target;
  };

  /*...*/
}

Finally, the wrapper simply initializes the polymorphic object, and then forwards calls to it as appropriate:

class closure {
  /*...*/

private:
  clone_ptr<interface> _content;

public:
  template <
    typename Target
  , typename DecayTarget = std::decay_t<Target>
  , typename Sfinae = decltype(std::declval<DecayTarget>()())
      // Disable this converting constructor if `target()` is ill-formed
  , typename Enable =
      std::enable_if_t<!std::is_base_of<closure, DecayTarget>::value>
      // See Universal References and the Copy Constructor - Eric Niebler
      // http://ericniebler.com/2013/08/07/universal-references-and-the-copy-constructo/
  >
  closure(Target&& target)
    : _content{new implementation<DecayTarget>{std::forward<Target>(target)}}
  {}

  closure(closure const& other) = default;

  void operator()() const {
    _content->invoke();
  }
};

This technique combines most of the strengths of templates with those of dynamic polymorphism. The mechanisms involved are not new, they are simply repackaged and hidden away from the user as implementation details. The resulting wrappers presents an interface that feels natural in Modern C++, which follows value semantics and has no naked pointers in sight. They can be stored in containers, and be used across virtual and binary interfaces. They also serve as a compilation wall, the erasure of the concrete type prevents or reduces the proliferation of different template instantiations, at the cost of some call overhead and less optimization opportunities.

It should be no surprise that Boost provides a library to facilitate the construction of these type-erased wrappers. Boost.TypeErasure provides an any class that can store objects that meet whatever requirements are specified. These requirements are expressed via concepts meta-types; the library provides a set of common ones, and users can defined their own ones as well. Most concepts can be specialized in order to achieve what effectively are concept maps.

Target Access

By making polymorphism an implementation detail, the run-time type information facilities were hidden as well. The interesting functionality of the target is already exposed by the public interface; nevertheless, std::function provides target access to the users by means of the following member functions:

[20.9.11.2.5] function target access

const std::type_info& target_type() const noexcept;
  • Returns: If *this has a target of type T, typeid(T); otherwise, typeid(void).

This function is equivalent to typeid(Target), with a special case since the standard wrapper can possibly be empty.

template<class T> T* target() noexcept;
  template<class T> const T* target() const noexcept;
  • Returns: If target_type() == typeid(T) a pointer to the stored function target; otherwise a null pointer.

This function provides only the basic functionality of dynamic_cast; since the exact target type has to be provided its use is even more questionable.

Given the previous definition of closure, this extra functionality can be implemented as follows:

class closure {
private:
  struct interface {
    /*...*/
    virtual std::type_info const& target_type() const noexcept = 0;
    virtual void* target(std::type_info const& type) noexcept = 0;
  };

  template <typename Target>
  struct implementation final : interface {
    /*...*/

    std::type_info const& target_type() const noexcept override {
      return typeid(Target);
    }

    void* target(std::type_info const& type) noexcept override {
      return typeid(Target) == type ? &_target : nullptr;
    }

    Target _target;
  };

private:
  clone_ptr<interface> _content;

public:
  /*...*/

  std::type_info const& target_type() const noexcept {
    return _content->target_type();
  }

  template <typename T>
  T* target() noexcept {
    return static_cast<T*>(_content->target(typeid(T)));
  }

  template <typename T>
  T const* target() const noexcept {
    return static_cast<T const*>(_content->target(typeid(T)));
  }
};

A note on Small Buffer Optimization

A common optimization is the Small Buffer Optimization, that consists of having embedded storage within the wrapper to store small target objects. This allows skipping a costly dynamic allocation, as well as increasing data locality. The standard suggest this optimization to implementors of std::function for small objects, while it requires it for function pointers and instances of std::reference_wrapper:

[20.9.11.2.1] function construct/copy/destroy

template<class F> function(F f);
  template <class F, class A> function(allocator_arg_t, const A& a, F f);
  • Postconditions: [...], *this targets a copy of f initialized with std::move(f). [Note: Implementations are encouraged to avoid the use of dynamically allocated memory for small callable objects, for example, where f’s target is an object holding only a pointer or reference to an object and a member function pointer. —end note]

  • Throws: shall not throw exceptions when f is a function pointer or a reference_wrapper<T> for some T. Otherwise, may throw bad_alloc or any exception thrown by F’s copy or move constructor.

Given that the constructor is not allowed to throw exceptions —not even std::bad_alloc—, dynamic memory allocation is ruled out so the target has to be stored in a buffer within the wrapper itself.

Summary

Type erasure is any technique in which a single type can be used to represent a wide variety of types that share a common interface. In the C++ lands, the term type-erasure is strongly associated with the particular technique that uses templates in the interface and dynamic polymorphism in the implementation.

  • A union is the simplest form of type erasure.

    • It is bounded, and all participating types have to be mentioned at the point of declaration.
  • A void pointer is a low-level form of type erasure. Functionality is provided by pointers to functions that operate on void* after casting it back to the appropriate type.

    • It is unbounded, but type unsafe.
  • Virtual functions offer a type safe form of type erasure. The underlying void and function pointers are generated by the compiler.

    • It is unbounded, but intrusive.
    • Has reference semantics.
  • A template based form of type erasure provides a natural C++ interface. The implementation is built on top of dynamic polymorphism.

    • It is unbounded and unintrusive.
    • Has value semantics.

References:

Inheritance Is The Base Class of Evil - Sean Parent

Type Erasure — Part I, Andrzej Krzemienski

Type Erasure — Part II, Andrzej Krzemienski

Type Erasure — Part III, Andrzej Krzemienski

Type Erasure — Part IV, Andrzej Krzemienski