19 Jun, 2013

Episode Five: Explicit is Better than Implicit

The Zen of Python tells us that Explicit is better than Implicit. This is good advice for any and all languages, but what are the implications in C++ land? There is one particular C++ feature that relates directly to that advice; one important enough to warrant a keyword of its own. That feature is user-defined conversions, and that keyword is explicit.

Conversions

Type conversion is an implicit or explicit operation that changes an instance of one type into another. C defines a number of standard conversions, including:

  • Numeric promotions, applied when a numeric value is used in a context where a numeric value of a larger type is expected. Those promotions are integral promotion and floating point promotion. A particular case is that of built-in arithmetic operators, which do not accept types smaller than int as arguments, so integral promotions are applied automatically. This conversion always preserves the value.

  • Numeric conversions, applied when a numeric value is used in a context where a numeric value of a different type is expected, and a promotion cannot be applied. Those conversions are integral conversions, floating point conversions, floating-integral conversions, pointer conversions, and boolean conversions. Unlike the promotions, numeric conversions may change the value, with potential loss of precision.

  • Lvalue transformations, applied when an lvalue argument is used in a context where an rvalue is expected. Those transformations are lvalue to rvalue conversion, array to pointer conversion, and function to pointer conversion. This is the type conversion applied to all function arguments when passed by value, and it is customarily referred to as argument decay.
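The three families above can be observed directly. The following is a minimal sketch, assuming a C++11 compiler; the helper names to_int_implicitly and check_standard_conversions are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Integral promotion: both short operands become int before the addition.
static_assert(std::is_same<decltype(short(1) + short(2)), int>::value,
              "short + short yields int");

// Lvalue transformations: std::decay models what happens to a by-value argument.
static_assert(std::is_same<std::decay<int[8]>::type, int*>::value,
              "an array decays to a pointer to its first element");
static_assert(std::is_same<std::decay<void(int)>::type, void (*)(int)>::value,
              "a function decays to a pointer to function");

// Numeric conversion: floating-integral, the fractional part is discarded.
int to_int_implicitly(double value) {
    int result = value;  // implicit conversion, may lose information
    return result;
}

void check_standard_conversions() {
    assert(to_int_implicitly(3.9) == 3);
    assert(to_int_implicitly(-3.9) == -3);
}
```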

C++ inherits all of these and adds a few of its own (the standard devotes an entire clause, [Clause 4], to standard conversions). In the spirit of supporting user-defined types as first class citizens, it allows said user-defined types to specify ways in which they can be converted from or to a different type. Those conversions are unsurprisingly known as user-defined conversions:

[12.3/1] Type conversions of class objects can be specified by constructors and by conversion functions. These conversions are called user-defined conversions and are used for implicit type conversions (Clause 4), for initialization (8.5), and for explicit type conversions (5.4, 5.2.9).

[12.3/4] At most one user-defined conversion (constructor or conversion function) is implicitly applied to a single value. [ Example:

struct X {
  operator int();
};

struct Y {
  operator X();
};

Y a;
int b = a; // error
           // a.operator X().operator int() not tried
int c = X(a); // OK: a.operator X().operator int()

—end example ]

User-defined conversions come in two flavors: conversion by construction, and conversion functions. The former is used to declare that a type T can be converted from a type U, while the latter is used to declare that a type T can be converted to a type U. Both forms are achieved by the definition of special member functions, which means that conversion operations cannot be added to a class after it has been defined; that is, user-defined conversions are intrusive.
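To make the two flavors concrete, here is a small sketch. The Meters type and the helper functions are hypothetical, introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical type, not taken from the standard.
struct Meters {
    Meters(double v) : value(v) {}             // conversion by construction: double -> Meters
    operator double() const { return value; }  // conversion function: Meters -> double
    double value;
};

double twice(Meters m) { return 2 * m.value; }

void check_both_directions() {
    Meters m = 1.5;             // double -> Meters through the constructor
    double d = m;               // Meters -> double through the conversion function
    assert(d == 1.5);
    assert(twice(2.0) == 4.0);  // the argument is converted at the call site
}
```

Both members live inside Meters; there is no way to teach double about Meters from the outside, which is the intrusiveness mentioned above.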

Conversion by Construction

The rule is simple: any non-explicit constructor is a converting constructor.

[12.3.1/1] A constructor declared without the function-specifier explicit specifies a conversion from the types of its parameters to the type of its class. Such a constructor is called a converting constructor. [ Example:

struct X {
  X(int);
  X(const char*, int =0);
  X(int, int);
};

void f(X arg) {
  X a = 1; // a = X(1)
  X b = "Jessie"; // b = X("Jessie",0)
  a = 2; // a = X(2)
  f(3); // f(X(3))
  f({1, 2}); // f(X(1,2))
}

—end example ]

It used to be the case, before the introduction of list-initialization, that a converting constructor was any non-explicit constructor that could be called with a single argument. Note that this is broader than constructors with a single parameter, as it also includes constructors with more than one parameter where every parameter, except possibly the first, specifies a default argument (the grammar requires that parameters without default arguments precede those with default arguments).

With the advent of list-initialization, a converting constructor can be called with more than just one argument. List-initialization is initialization of an object or reference from a braced-init-list. A braced-init-list is a syntactic construct that aggregates any number of arguments within braces.

The following example from the standard library, one of the constructors of std::complex, has it all:

[26.4.4] complex member functions

template<class T> constexpr complex(const T& re = T(), const T& im = T());
  • Effects: Constructs an object of class complex.
  • Postcondition: real() == re && imag() == im.

[...]

Given void f(std::complex<float> arg){}:

  • It is a default constructor, since it can be invoked with no arguments and default arguments will be used for both the real and imaginary parts, e.g., f(std::complex<float>{});
  • It is a converting constructor, since it is not explicit.
    • It is a converting constructor from T, since it can be invoked with one argument and the default argument will be used for the imaginary part, e.g., f(1.f);
    • It is a converting constructor from a braced-init-list of two elements (or zero or one, given the default arguments), e.g., f({1.f, 0.f});
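The calls listed above can be checked with a short sketch. The function pass_through is a hypothetical variation on f that returns its argument so that the result can be inspected.

```cpp
#include <cassert>
#include <complex>

// Same parameter as f in the text, but the argument is handed back to the caller.
std::complex<float> pass_through(std::complex<float> arg) { return arg; }

void check_complex_conversions() {
    // Used as a default constructor.
    assert(pass_through(std::complex<float>{}) == std::complex<float>(0.f, 0.f));
    // Used as a converting constructor from a single float.
    assert(pass_through(1.f) == std::complex<float>(1.f, 0.f));
    // Used as a converting constructor from a braced-init-list of two elements...
    assert(pass_through({1.f, 2.f}) == std::complex<float>(1.f, 2.f));
    // ...or of none at all.
    assert(pass_through({}) == std::complex<float>(0.f, 0.f));
}
```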

explicit constructors

There is no such thing as an explicit converting constructor, because there is no need for one. Explicit constructors behave exactly as one would expect explicit converting constructors to behave if they existed; that is, they are only invoked when explicitly requested:

[12.3.1/2] An explicit constructor constructs objects just like non-explicit constructors, but does so only where the direct-initialization syntax (8.5) or where casts (5.2.9, 5.4) are explicitly used. A default constructor may be an explicit constructor; such a constructor will be used to perform default-initialization or value-initialization (8.5). [ Example:

struct Z {
  explicit Z();
  explicit Z(int);
  explicit Z(int, int);
};

Z a; // OK: default-initialization performed
Z a1 = 1; // error: no implicit conversion
Z a3 = Z(1); // OK: direct initialization syntax used
Z a2(1); // OK: direct initialization syntax used
Z* p = new Z(1); // OK: direct initialization syntax used
Z a4 = (Z)1; // OK: explicit cast used
Z a5 = static_cast<Z>(1); // OK: explicit cast used
Z a6 = { 3, 4 }; // error: no implicit conversion

—end example ]
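The difference between the two forms of initialization can also be queried at compile time. This is a sketch using a hypothetical Buffer type: std::is_convertible models copy-initialization, while std::is_constructible models direct-initialization.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical type: a size is not a buffer, so the constructor is explicit.
struct Buffer {
    explicit Buffer(std::size_t n) : size(n) {}
    std::size_t size;
};

std::size_t size_of(const Buffer& b) { return b.size; }

static_assert(!std::is_convertible<std::size_t, Buffer>::value,
              "no implicit conversion from std::size_t");
static_assert(std::is_constructible<Buffer, std::size_t>::value,
              "direct-initialization is still available");

void check_explicit_constructor() {
    // size_of(16);  // would not compile: it needs an implicit conversion
    assert(size_of(Buffer(16)) == 16u);
    assert(size_of(static_cast<Buffer>(32)) == 32u);
}
```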

A note on list-initialization

List-initialization can be used in several places where an object is expected, notably as function arguments and in return statements. As far as conversions are concerned there is nothing new about that: it is just another context in which implicit conversions can occur. There is plenty that is special about list-initialization itself, but that is a subject for another tale...
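As an illustration of those two contexts, consider the following sketch. The Point type and the mirror function are hypothetical.

```cpp
#include <cassert>

// Hypothetical type with a non-explicit two-parameter constructor.
struct Point {
    Point(int px, int py) : x(px), y(py) {}
    int x;
    int y;
};

Point mirror(Point p) {
    return {-p.x, -p.y};  // braced-init-list in a return statement
}

void check_list_initialization_contexts() {
    Point q = mirror({1, 2});  // braced-init-list as a function argument
    assert(q.x == -1);
    assert(q.y == -2);
}
```

Were the constructor declared explicit, neither the return statement nor the call would compile.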

Conversion Functions

A conversion function is a special member function named operator T, where T is a type, that has no parameters other than the implicit object parameter. There is no return type in the declaration, as the return type is the type we are converting to, namely T. This leaves function types and array types out, since they cannot be returned from a function (although references to those types can).
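The remark about references can be sketched as follows. The Triple type is hypothetical, and a typedef is used because the conversion-type-id grammar leaves no room for the usual array declarator.

```cpp
#include <cassert>

// Hypothetical type wrapping an array of three ints.
struct Triple {
    typedef int Array3[3];
    int values[3];
    operator Array3&() { return values; }  // OK: converts to a reference to an array
    // operator Array3();                  // ill-formed: a function cannot return an array
};

void check_array_reference_conversion() {
    Triple t = {{1, 2, 3}};
    int (&ref)[3] = t;  // binds through Triple::operator Array3&()
    ref[0] = 10;
    assert(t.values[0] == 10);
    assert(ref[2] == 3);
}
```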

[12.3.2/1] A member function of a class X having no parameters with a name of the form

conversion-function-id:
  operator conversion-type-id

conversion-type-id:
  type-specifier-seq conversion-declarator-opt

conversion-declarator:
  ptr-operator conversion-declarator-opt

specifies a conversion from X to the type specified by the conversion-type-id. Such functions are called conversion functions. No return type can be specified. If a conversion function is a member function, the type of the conversion function (8.3.5) is “function taking no parameter returning conversion-type-id”. A conversion function is never used to convert a (possibly cv-qualified) object to the (possibly cv-qualified) same object type (or a reference to it), to a (possibly cv-qualified) base class of that type (or a reference to it), or to (possibly cv-qualified) void. [ Example:

struct X {
  operator int();
};

void f(X a) {
  int i = int(a);
  i = (int)a;
  i = a;
}

In all three cases the value assigned will be converted by X::operator int(). —end example ]

Some examples from the standard library are:

explicit conversion functions

An explicit conversion function is a conversion function that is declared using the explicit specifier.

[12.3.2/2] A conversion function may be explicit (7.1.2), in which case it is only considered as a user-defined conversion for direct-initialization (8.5). Otherwise, user-defined conversions are not restricted to use in assignments and initializations. [ Example:

class Y { };
struct Z {
  explicit operator Y() const;
};

void h(Z z) {
  Y y1(z); // OK: direct-initialization
  Y y2 = z; // ill-formed: copy-initialization
  Y y3 = (Y)z; // OK: cast notation
}

void g(X a, X b) {
  int i = (a) ? 1+a : 0;
  int j = (a&&b) ? a+b : i;
  if (a) {
  }
}

—end example ]

An explicit conversion function is not considered in contexts where implicit conversions happen, which keeps the conversion from participating in unwanted expressions. Consider a conversion from a user-defined class X to bool. Given that bool is an integral type, if said conversion function is not explicit then suddenly a value of type X could participate in all kinds of integral expressions. While it may make sense for an object representing a file to convert to false when said file is not open, it is utterly senseless to add files or to compare files using relational operators.
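The file scenario can be sketched with two hypothetical types, LooseFile and StrictFile, that differ only in the explicit specifier.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types standing in for the file class discussed above.
struct LooseFile {
    bool open;
    operator bool() const { return open; }           // implicit
};

struct StrictFile {
    bool open;
    explicit operator bool() const { return open; }  // explicit
};

// The implicit conversion leaks into every integral context...
static_assert(std::is_convertible<LooseFile, int>::value,
              "LooseFile -> bool -> int");
// ...while the explicit one is only reachable through direct-initialization.
static_assert(!std::is_convertible<StrictFile, int>::value, "no path to int");
static_assert(!std::is_convertible<StrictFile, bool>::value, "no implicit path to bool");
static_assert(std::is_constructible<bool, StrictFile>::value, "bool t(e) is well-formed");

void check_implicit_bool_pitfall() {
    LooseFile a = {true};
    LooseFile b = {true};
    assert(a + b == 2);  // adding files: nonsense, yet it compiles
    assert(!(a < b));    // comparing files: equally senseless

    StrictFile c = {true};
    // int n = c + c;    // would not compile
    assert(static_cast<bool>(c));
}
```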

It may seem at first that while explicit conversion functions solve the issue, they incidentally impose additional verbosity in achieving the intended semantics —for instance, by requiring one to write if (static_cast<bool>(e)) instead of the more idiomatic if (e)—. However, the notion of contextual conversions was introduced so that certain statements behave as if the conversion were explicitly requested.

[4/4] Certain language constructs require that an expression be converted to a Boolean value. An expression e appearing in such a context is said to be contextually converted to bool and is well-formed if and only if the declaration bool t(e); is well-formed, for some invented temporary variable t (8.5).

[4/5] Certain language constructs require conversion to a value having one of a specified set of types appropriate to the construct. An expression e of class type E appearing in such a context is said to be contextually implicitly converted to a specified type T and is well-formed if and only if e can be implicitly converted to a type T that is determined as follows: E is searched for conversion functions whose return type is cv T or reference to cv T such that T is allowed by the context. There shall be exactly one such T.
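Both kinds of contextual conversion show up in the following sketch. Connection, Level and describe are hypothetical names: the first type has an explicit conversion to bool, the second a single non-explicit conversion to int.

```cpp
#include <cassert>

// Hypothetical type, contextually converted to bool in conditions.
struct Connection {
    bool alive;
    explicit operator bool() const { return alive; }
};

// Hypothetical type, contextually implicitly converted to int in a switch.
struct Level {
    int value;
    operator int() const { return value; }
};

int describe(const Connection& c, const Level& l) {
    if (!c) {    // operand of ! is contextually converted to bool
        return -1;
    }
    switch (l) { // exactly one suitable conversion function: operator int
        case 0:
            return 0;
        default:
            return (c && l.value > 0) ? 1 : 2;  // operand of && likewise
    }
}
```

No static_cast<bool> is needed anywhere, yet bool b = c; would still be rejected.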

A note on bool

Before explicit conversion functions were introduced in C++11, conversions to bool presented a real problem, as exemplified earlier. Components of the IOStreams portion of the standard library initially tackled this problem by defining a conversion to void* instead, for which there is a standard conversion to bool. This prevents participation in arithmetic expressions, but it opens the door to nonsense code such as delete std::cout. A solution using pointers to members, which conveniently cannot be the operand of a delete expression, was then born; it is known as the (now obsolete) safe bool idiom.
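For reference, a simplified sketch of the safe bool idiom follows. The Resource class and its member names are hypothetical, and the sketch does not guard against == comparisons between two such objects.

```cpp
#include <cassert>

// Hypothetical class using a pointer to member function as the "bool" type.
class Resource {
    typedef void (Resource::*safe_bool)() const;
    void true_tag() const {}
    bool valid_;

public:
    explicit Resource(bool valid) : valid_(valid) {}

    // A non-null pointer to member converts to true, a null one to false.
    operator safe_bool() const {
        if (valid_) {
            return &Resource::true_tag;
        }
        return 0;
    }
};

int as_flag(const Resource& r) {
    // int n = r + 1;  // would not compile: no arithmetic on pointers to members
    // delete r;       // would not compile: not a pointer to an object
    return r ? 1 : 0;
}
```

With C++11, a single explicit operator bool() const replaces all of this machinery.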

Summary

User-defined types in C++ can specify user-defined conversions akin to the standard conversions for fundamental types.

  • Conversion operations are intrusive to either the source type or the target type of the conversion.
  • Converting constructors specify a conversion from a different type, conversion functions specify a conversion to a different type.
  • Every constructor is a converting constructor unless declared explicit.
  • Conversion functions declared explicit still participate in certain language constructs, such as conditions, through contextual conversions.

References: