20 Feb, 2014

Rant: On the Templated Nature of std::chrono

There is a trending misconception about the templated nature of std::chrono. The claim comes in different forms, but it always boils down to std::chrono::duration and std::chrono::time_point being templates and thus impossible to use together with virtual functions or across a binary interface. The claim seems to be based on the assumption that using the templated features of std::chrono require templates all the way down, which is blatantly wrong...

Poor Choices

The defective interface presented as a workaround often looks like this:

// declarations in a header file, definitions in a source file
void wait_until(std::chrono::steady_clock::time_point const& abs_time);
void wait_for(std::chrono::steady_clock::duration const& rel_time);

The mere existence of std::this_thread::sleep_for and std::this_thread::sleep_until should be a strong hint that this design is misguided. After all, it's hard to imagine existing threading APIs being templatized and shipped as header-only libraries just to support this.

Timing Specifications

The standard threading support library defines a set of timing specifications:

[30.2.4/1] Several functions described in this Clause take an argument to specify a timeout. These timeouts are specified as either a duration or a time_point type as specified in 20.12.

[30.2.4/2] Implementations necessarily have some delay in returning from a timeout. Any overhead in interrupt response, function return, and scheduling induces a "quality of implementation" delay, expressed as duration Di. Ideally, this delay would be zero. Further, any contention for processor and memory resources induces a "quality of management" delay, expressed as duration Dm. The delay durations may vary from timeout to timeout, but in all cases shorter is better.

An implementation is allowed to return at any point in time after the timeout is met. The observable difference is due to quality of implementation and quality of management delays.

[30.2.4/3] The member functions whose names end in _for take an argument that specifies a duration. These functions produce relative timeouts. Implementations should use a steady clock to measure time for these functions. Given a duration argument Dt, the real-time duration of the timeout is Dt + Di + Dm.

The clock used to measure relative timeouts is not specified; it can be any clock, not necessarily a steady one —the use of should indicates a suggestion, not a requirement—. This allows an implementation of sleep_for to forward the call to the underlying threading API after converting the duration to the appropriate one:

// defined in a different translation unit/library
void sleep_for(_Underlying_threading_api_duration const& rel_time);

template<typename Rep, typename Period>
void sleep_for(std::chrono::duration<Rep, Period> const& rel_time) {
  sleep_for(duration_cast<_Underlying_threading_api_duration>(rel_time));
}

If the relative timeout is not exactly convertible to the underlying duration, the implementation must round it up. An implementation is already allowed to wait a little longer, and sometimes it has to do it since it's not allowed to return early.

[30.2.4/4] The member functions whose names end in _until take an argument that specifies a time point. These functions produce absolute timeouts. Implementations should use the clock specified in the time point to measure time for these functions. Given a clock time point argument Ct, the clock time point of the return from timeout should be Ct + Di + Dm when the clock is not adjusted during the timeout. If the clock is adjusted to the time Ca during the timeout, the behavior should be as follows:

  • if Ca > Ct, the waiting function should wake as soon as possible, i.e. Ca + Di + Dm, since the timeout is already satisfied. [ Note: This specification may result in the total duration of the wait decreasing when measured against a steady clock. —end note ]

  • if Ca <= Ct, the waiting function should not time out until Clock::now() returns a time Cn >= Ct, i.e. waking at Ct + Di + Dm. [ Note: When the clock is adjusted backwards, this specification may result in the total duration of the wait increasing when measured against a steady clock. When the clock is adjusted forwards, this specification may result in the total duration of the wait decreasing when measured against a steady clock. —end note ]

An implementation shall return from such a timeout at any point from the time specified above to the time it would return from a steady-clock relative timeout on the difference between Ct and the time point of the call to the _until function. [ Note: Implementations should decrease the duration of the wait when the clock is adjusted forwards. —end note ]

The clock used to measure absolute timeouts is not specified either; an implementation may choose to ignore the given Clock and use a steady one instead. For an implementation of sleep_until to acknowledge adjustments on a clock it is unaware of, it would have to constantly poll the value of Clock::now() —which is a rather busy sleep pattern—. Alternatively, an implementation can sleep for abs_time - Clock::now() as long as it measures it on a steady clock —the use of shall indicates a requirement—:

template<typename Clock, typename Duration>
void sleep_until(std::chrono::time_point<Clock, Duration> const& abs_time) {
  do {
    sleep_for(abs_time - Clock::now()); // assuming sleep_for uses a steady clock
  } while (Clock::now() < abs_time)
}

While this is enough to be conformant, it doesn't even try to acknowledge forward adjustments to the clock. A better implementation would define a maximum sleep time, so that every so often the provided clock is queried again and the relative timeout is recalculated.

Finally, the standard reads on a footnote:

All implementations for which standard time units are meaningful must necessarily have a steady clock within their hardware implementation.

At some point during the standardization process steady_clock was not provided, a monotonic_clock with weaker guarantees was provided instead and it was conditionally supported. N3191 noted that the timeout definition necessarily depends on a steady clock, and that a monotonic clock is of little benefit given a steady one.

Putting It All Together

The timing specifications do not provide any real-time guarantees, all they guarantee is that after a successful return the timeout has been observed. In the face of clock adjustments, this observation might no longer hold by the time the user regains control. Insulation from clock adjustments is a valuable property to build on top of.

A conformant implementation always provides a steady_clock —one that cannot be adjusted—. While a call in the form xxx_for(rel_time) will not necessarily use a steady clock, the seemingly equivalent call xxx_until(std::chrono::steady_clock::now() + rel_time) is guaranteed to do so. A curious exception is that of std::condition_variable and std::condition_variable_any, for which wait_for is defined in terms of wait_until and steady_clock —a quick search of the standard does not show any other perpetrators—.

Clock adjustments are often a source of surprises, whether they are ignored or acknowledged: an alarm clock set to midnight should better fire at midnight, even when the user moves to a different time zone in between; while a timeout of ten seconds for an operation to complete should wait for only ten seconds and not an hour and ten seconds. The decision on how to deal with clock adjustments may pertain to the application domain, or ultimately to the end user.

Ignoring clock adjustments

Durations are clock agnostic, and thus they are unaffected by clock adjustments when measured against a steady clock. The building blocks for this approach are std::chrono::nanoseconds —a duration with enough resolution and capable of representing up to 292 years— and std::chrono::steady_clock:

// defined elsewhere
void wait_for(std::chrono::nanoseconds const& rel_time);

template<typename Rep, typename Period>
void wait_for(std::chrono::duration<Rep, Period> const& rel_time) {
  auto d = std::chrono::duration_cast<std::chrono::nanoseconds>(rel_time);
  if (d < rel_time) ++d; // round up
  return wait_for(d);
}

template<typename Clock, typename Duration>
void wait_until(std::chrono::time_point<Clock, Duration> const& abs_time) {
  return wait_for(abs_time - Clock::now());
}

// --- elsewhere ---
void wait_for(std::chrono::nanoseconds const& rel_time) {
  std::this_thread::sleep_until(std::chrono::steady_clock::now() + rel_time);
  /* possibly do something now... */
}

The entire functionality is implemented on top of a single non-templated wait_for function —possibly virtual—. This function uses xxx_until together with steady_clock internally in order to guarantee clock adjustments have no effect. Since clock adjustments are ignored or non-existent, it is possible to represent absolute timeouts as relative timeouts instead.

This implementation can be provided after the fact around a poorly designed interface, as long as said interface advertises using a steady clock.

Acknowledging clock adjustments

Unsurprisingly, acknowledging clock adjustments requires dealing with clocks and hence with time points. However, this approach chooses to represent time points as the time since the clock's epoch —a duration, in std::chrono::nanoseconds—. The current time point of the clock is also represented in this way, with help from some trivial type erasure:

// Clock::now() time since epoch in the Duration given
template <typename Clock, typename Duration>
Duration time_since_epoch()
{
  auto now = Clock::now().time_since_epoch();
  auto d = std::chrono::duration_cast<Duration>(now);
  if (d > now) --d; // round down
  return d;
}

// defined elsewhere
void wait_until(std::chrono::nanoseconds const& abs_time, std::chrono::nanoseconds (*clock_now)());

template<typename Clock, typename Duration>
void wait_until(std::chrono::time_point<Clock, Duration> const& abs_time)
{
  auto tp = abs_time.time_since_epoch();
  auto d = std::chrono::duration_cast<std::chrono::nanoseconds>(tp);
  if (d < tp) ++d; // round up
  return wait_until(d, &time_since_epoch<Clock, std::chrono::nanoseconds>);
}

template<typename Rep, typename Period>
void wait_for(std::chrono::duration<Rep, Period> const& rel_time) {
  return wait_until(std::chrono::steady_clock::now() + rel_time);
}

// --- elsewhere ---
void wait_until(std::chrono::nanoseconds const& abs_time, std::chrono::nanoseconds (*clock_now)())
{
  std::chrono::nanoseconds const threshold = std::chrono::seconds{1};
  do {
    auto rel_time = std::min(abs_time - clock_now(), threshold);
    std::this_thread::sleep_until(std::chrono::steady_clock::now() + rel_time);
  } while (clock_now() < abs_time);
  /* possibly do something now... */
}

The entire functionality is implemented on top of a single non-templated wait_until function —possibly virtual—. This function uses xxx_until together with steady_clock internally in order to avoid unexpected clock adjustments. The provided clock is polled every so often to check for forward clock adjustments, and again at the end when the timeout is believed met to check for backwards clock adjustments.

Poor Choices, Uncovered

The templated nature of std::chrono is designed to give users expressiveness when dealing with time points and durations. In the absence of clock adjustments, all possible ways of expressing a timeout represent the same single underlying value —akin to a units library for timeouts—. Clock adjustments are tricky on their own, to the point that honoring them is not required by the standard.

A defective interface like the one presented earlier, designed only in terms of std::chrono::steady_clock, is being repetitive and lazy at the same time. The single fundamental operation is exposed in terms of not one but two timeout units, while the burden of converting from all other possible ways of expressing them is left to the user.