
Using CRTP to easily hijack operators in c++11

In this post I will show a neat trick for hijacking operators in c++ without writing tons of repetitive code. As an example, I will hijack the output operator <<, which is used to place objects into a stream. This allows for constructions such as cout << multi('=', 80) << endl; to fill a standard console line with equality signs. The hijacking will allow me to write a very small class that focuses on the core task: to take hold of the output stream and put 80 equality signs into it.

To allow the multiplier object to focus on its core task, I will employ the curiously recurring template pattern (CRTP) with an empty base-class. Since we are hijacking the output operator, which is left-associative, I will call this base class an ltor_insert. It is, after all, inserted in a left-to-right associative operator chain.

template<typename insert_T> struct ltor_insert {};

It is an empty structure that takes some other class as a template parameter. The template parameter is what causes it to be a CRTP base-class. The curiosity of the pattern shows in the derived class, in this example, the class that will determine what happens when the output stream is hijacked. I call it multi_t, because it is a type that will be returned when the factory function multi is called.

template<typename what_T>
class multi_t : public ltor_insert<multi_t<what_T> > {
};

As you can see, it inherits from ltor_insert, which makes it a derived class, but it derives from an instantiation of ltor_insert that is parameterized with itself! This is the curious part of the curiously recurring template pattern. It works by making ltor_insert aware of its derived class at compile time, which we will exploit in the hijacking.
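To make the "base class knows its derived class" part concrete, here is a minimal sketch that is separate from the stream example. The names shape_base, circle and describe_circle are hypothetical and only serve the illustration.

```cpp
#include <string>

// Hypothetical CRTP base: it receives the derived type as a template
// parameter, so it can downcast itself without any virtual functions.
template<typename derived_T>
struct shape_base {
  std::string describe() const {
    // Safe as long as derived_T really is the class deriving from us.
    return "shape: " + static_cast<const derived_T&>(*this).name();
  }
};

// The "curious" part: circle derives from a base parameterized with circle.
struct circle : shape_base<circle> {
  std::string name() const { return "circle"; }
};

// Hypothetical helper that exercises the base-to-derived call.
inline std::string describe_circle() {
  return circle().describe();
}
```

The base never mentions circle by name, yet it ends up calling circle::name(). That is the same mechanism the hijacking operator relies on later.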

It also takes the thing-to-be-printed as a templated type (what_T), allowing you to switch freely between e.g. chars and strings. The task of the object is to remember what needs to be printed and how many times, so it will need members to store that information (don’t worry: the object is a short-lived temporary and everything is inline, so an optimizing compiler can usually make the overhead disappear), and a constructor to populate them:

template<typename what_T>
class multi_t : public ltor_insert<multi_t<what_T> > {
public:
  inline multi_t(what_T&& what, uint32_t times)
    : what_m(std::forward<what_T>(what)), times_m(times)
  {}
private:
  what_T what_m;
  uint32_t times_m;
};

Astute readers will notice that this class potentially stores references (whenever what_T is itself a reference type), which is generally a very bad idea (don’t do that if you can avoid it!). In this case, there are redeeming qualities: the only place where the what_m member is set is in the constructor, the object is meant to live only for the duration of a single expression, and the benefit of not having to copy things passed in as references into the object makes it a reasonable trade-off.

To code the behavior of the object, we will make it into a unary functor by implementing the ()-operator. Since we are hijacking an output stream, the functor will take a reference to an std::ostream as its sole parameter and return type. The implementation just puts times_m instances of what_m into the stream passed to the function, and then returns it. The final class looks like this:

template<typename what_T>
class multi_t : public ltor_insert<multi_t<what_T> > {
public:
  inline multi_t(what_T&& what, uint32_t times)
    : what_m(std::forward<what_T>(what)), times_m(times)
  {}
  inline std::ostream& operator()(std::ostream& s) const {
    for (uint32_t i = 0; i < times_m; ++i) {
      s << what_m;
    }
    return s;
  }
private:
  what_T what_m;
  uint32_t times_m;
};
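Before any operator gets involved, the functor can be tried out by hand. The sketch below repeats the class from above and calls it on a std::ostringstream; the helper repeat_to_string is a hypothetical name added just for this illustration.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

template<typename insert_T> struct ltor_insert {};

template<typename what_T>
class multi_t : public ltor_insert<multi_t<what_T> > {
public:
  inline multi_t(what_T&& what, uint32_t times)
    : what_m(std::forward<what_T>(what)), times_m(times)
  {}
  inline std::ostream& operator()(std::ostream& s) const {
    for (uint32_t i = 0; i < times_m; ++i) {
      s << what_m;
    }
    return s;
  }
private:
  what_T what_m;
  uint32_t times_m;
};

// Hypothetical helper: applies the functor with plain call syntax.
inline std::string repeat_to_string() {
  std::ostringstream out;
  multi_t<std::string> pattern(std::string("-="), 3);
  pattern(out);  // no operator<< yet, just operator()
  return out.str();
}
```

Calling repeat_to_string() yields "-=-=-=", which shows that all the real work lives in operator().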

To construct an instance of multi_t, we have to call:

multi_t<std::string>(std::string("="), 80);

This becomes tedious surprisingly quickly, and doesn’t do much for readability or ease of use. Instead, I like to have a convenience function that derives the correct parameters to multi_t by what was actually passed into it:

template<typename what_T>
inline multi_t<what_T> multi(what_T&& what, uint32_t times) {
  return multi_t<what_T>(std::forward<what_T>(what), times);
}

This allows multi to be called with any first parameter, and the convenience function and the corresponding return type will be instantiated when needed. Also note the use of perfect forwarding to guarantee that the convenience function does not interfere by causing unwanted copying of the thing being multiplied.
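It can be instructive to check what the factory actually deduces. The following sketch repeats the class and factory from above and adds a hypothetical helper, deduction_matches, that compares the deduced return types with std::is_same.

```cpp
#include <cstdint>
#include <ostream>
#include <string>
#include <type_traits>
#include <utility>

template<typename insert_T> struct ltor_insert {};

template<typename what_T>
class multi_t : public ltor_insert<multi_t<what_T> > {
public:
  inline multi_t(what_T&& what, uint32_t times)
    : what_m(std::forward<what_T>(what)), times_m(times)
  {}
  inline std::ostream& operator()(std::ostream& s) const {
    for (uint32_t i = 0; i < times_m; ++i) {
      s << what_m;
    }
    return s;
  }
private:
  what_T what_m;
  uint32_t times_m;
};

template<typename what_T>
inline multi_t<what_T> multi(what_T&& what, uint32_t times) {
  return multi_t<what_T>(std::forward<what_T>(what), times);
}

// Hypothetical helper: inspects the types without running multi at all.
inline bool deduction_matches() {
  std::string line("=");
  // Temporary argument: what_T is char, the object owns its own copy.
  bool rvalue_ok =
    std::is_same<decltype(multi('=', 80)), multi_t<char> >::value;
  // Named argument: what_T is std::string&, the object stores a reference.
  bool lvalue_ok =
    std::is_same<decltype(multi(line, 80)), multi_t<std::string&> >::value;
  return rvalue_ok && lvalue_ok;
}
```

So a named std::string is stored by reference and a temporary is stored by value, which is exactly the trade-off discussed for what_m.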

So far, everything looks pretty straightforward (except for the CRTP stuff, which you say will be useful right about now (if you want me to keep reading; yes, yes, I know, getting to it!)). The remaining part is the “special sauce” that ties the whole dish together. But, let's start at the beginning. What we want is to have an << operator for std::ostream which defines what happens when a multi_t is inserted into the stream. We would typically write something like this:

template<typename what_T>
inline std::ostream&
operator<<(std::ostream& s, const multi_t<what_T>& x) {
  return x(s);
}

(Well, actually, we would probably have put the implementation of multi_t::operator() straight into the implementation above.) Now, let's see how we can generalize this.

First, we can replace the explicit multi_t with “anything that derives from ltor_insert”. This is where the CRTP comes into play, as it allows us to statically cast the ltor_insert into its derived class and reach the functionality implemented there!

template<typename insert_T>
inline std::ostream&
operator<<(std::ostream& s, const ltor_insert<insert_T>& x) {
  return static_cast<const insert_T&>(x)(s);
}
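The payoff of this first step is that a single operator now serves every class that derives from ltor_insert. As a sketch, here are two made-up hijackers, indent_t and bracket_t (both hypothetical, not part of the original code), sharing that one operator.

```cpp
#include <ostream>
#include <sstream>
#include <string>

template<typename insert_T> struct ltor_insert {};

template<typename insert_T>
inline std::ostream&
operator<<(std::ostream& s, const ltor_insert<insert_T>& x) {
  return static_cast<const insert_T&>(x)(s);
}

// Hypothetical hijacker: writes a fixed number of spaces.
struct indent_t : ltor_insert<indent_t> {
  explicit indent_t(unsigned width) : width_m(width) {}
  std::ostream& operator()(std::ostream& s) const {
    for (unsigned i = 0; i < width_m; ++i) s << ' ';
    return s;
  }
  unsigned width_m;
};

// Hypothetical hijacker: wraps a word in square brackets.
struct bracket_t : ltor_insert<bracket_t> {
  explicit bracket_t(const std::string& word) : word_m(word) {}
  std::ostream& operator()(std::ostream& s) const {
    return s << '[' << word_m << ']';
  }
  std::string word_m;
};

// Hypothetical helper: both hijackers go through the same operator<<.
inline std::string format_tag() {
  std::ostringstream out;
  out << indent_t(2) << bracket_t("ok");
  return out.str();
}
```

Neither class declares an operator<< of its own; deriving from ltor_insert is all it takes to join the chain.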

Second, we can replace the stream with a template parameter. After all, it makes little sense to only be able to hijack an output stream.

template<typename base_T, typename insert_T>
inline base_T&
operator<<(base_T& base, const ltor_insert<insert_T>& x) {
  return static_cast<const insert_T&>(x)(base);
}
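With the base turned into a template parameter, nothing ties the operator to streams any more. As a sketch of what that opens up, here is a hypothetical hijacker, append_t, that pushes values into a std::vector<int> instead (the vector use case is my own illustration, not something the original code does).

```cpp
#include <vector>

template<typename insert_T> struct ltor_insert {};

template<typename base_T, typename insert_T>
inline base_T&
operator<<(base_T& base, const ltor_insert<insert_T>& x) {
  return static_cast<const insert_T&>(x)(base);
}

// Hypothetical hijacker: appends one value to a std::vector<int>.
struct append_t : ltor_insert<append_t> {
  explicit append_t(int value) : value_m(value) {}
  std::vector<int>& operator()(std::vector<int>& v) const {
    v.push_back(value_m);
    return v;
  }
  int value_m;
};

// Hypothetical helper: chains three insertions, left to right.
inline std::vector<int> build_vector() {
  std::vector<int> v;
  v << append_t(1) << append_t(2) << append_t(3);
  return v;
}
```

Because each insertion returns the vector by reference, the chain keeps working on the same object from left to right.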

Third, we can take the base by universal reference. This works because of the reference collapsing rules in c++11, which we can use to ensure perfect forwarding, that is: the base is passed into the hijacking functor exactly as it was passed into the <<-operator:

template<typename base_T, typename insert_T>
inline base_T&&
operator<<(base_T&& base, const ltor_insert<insert_T>& x) {
  return static_cast<const insert_T&>(x)(std::forward<base_T>(base));
}
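The deduction behind base_T&& can be observed in isolation. This is a small sketch with a hypothetical probe function, deduced_as_lvalue_reference, that reports whether base_T came out as an lvalue reference.

```cpp
#include <string>
#include <type_traits>

// Hypothetical probe: the parameter has the same shape as the operator's
// first parameter, so base_T is deduced in the same way.
template<typename base_T>
inline bool deduced_as_lvalue_reference(base_T&&) {
  return std::is_lvalue_reference<base_T>::value;
}

inline bool lvalue_case() {
  std::string s("named");
  return deduced_as_lvalue_reference(s);  // base_T = std::string&
}

inline bool rvalue_case() {
  // base_T = std::string (no reference at all)
  return deduced_as_lvalue_reference(std::string("temporary"));
}

// Reference collapsing: adding && to int& still gives int&, which is why
// base_T&& is a plain lvalue reference when an lvalue is passed in.
static_assert(
  std::is_same<std::add_rvalue_reference<int&>::type, int&>::value,
  "int& && collapses to int&");
```

For cout, which is a named object, base_T becomes std::ostream& and base_T&& collapses to std::ostream&, so the stream travels through the operator as the lvalue it was.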

Fourth, we can infer the return type from the inserted functor. That is, instead of insisting that a hijacking functor returns the same type that it is passed, we will allow the operator to result in whatever-the-hijacking-functor-results-in. C++11 provides the trailing return type syntax (auto in front, -> after the parameter list) and the decltype keyword for this purpose:

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, const ltor_insert<insert_T>& x) ->
decltype(static_cast<const insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<const insert_T&>(x)(std::forward<base_T>(base));
}

The return type is written as the keyword auto, which signals that the actual return type follows after the parameter list. It has to come after the parameter list, because it refers to the parameters, and they are not in scope before that point. After the parameter list, add an arrow ->, and what the return type should be. In this case, we use the expression we want to return, and specify that the return type is whatever-the-expression-is-declared-to-result-in, which is what decltype gives us.
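To see what the inferred return type buys us, consider a functor that does not hand the stream back at all. The sketch below uses a hypothetical hijacker, take_text_t, which ends a chain by returning the text collected so far; it only works on std::ostringstream, which is an assumption of this example.

```cpp
#include <sstream>
#include <string>
#include <utility>

template<typename insert_T> struct ltor_insert {};

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, const ltor_insert<insert_T>& x) ->
decltype(static_cast<const insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<const insert_T&>(x)(std::forward<base_T>(base));
}

// Hypothetical hijacker: returns a std::string instead of the stream.
struct take_text_t : ltor_insert<take_text_t> {
  std::string operator()(std::ostringstream& s) const { return s.str(); }
};

// Hypothetical helper: the last insertion decides the expression's type.
inline std::string collect() {
  std::ostringstream out;
  out << "abc" << 123;
  // decltype makes this whole expression a std::string.
  return out << take_text_t();
}
```

With the fixed base_T&& return type from the previous step this would not compile, since a std::string cannot be returned as a reference to a stream.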

With all the new degrees of freedom, we might want to guard against the operator being used with a class that derives from ltor_insert but never defines its own ()-operator. One way to do this is to build a reasonable default behaviour into ltor_insert, such as just returning whatever is passed in:

template<typename insert_T> struct ltor_insert {
  template<typename base_T>
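  // std::forward keeps the value category; a plain "return base;" would not
  // compile when base_T is a non-reference type.
  inline base_T&& operator()(base_T&& base) const { return std::forward<base_T>(base); }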
};
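Here is a sketch of the default in action, using a hypothetical hijacker, noop_t, that derives from ltor_insert without defining a ()-operator. Note that this sketch returns std::forward<base_T>(base) from the default; that is my assumption about what is needed for it to also compile when base_T is not a reference.

```cpp
#include <sstream>
#include <string>
#include <utility>

template<typename insert_T> struct ltor_insert {
  // Default behaviour: hand the base back untouched.
  template<typename base_T>
  inline base_T&& operator()(base_T&& base) const {
    return std::forward<base_T>(base);
  }
};

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, const ltor_insert<insert_T>& x) ->
decltype(static_cast<const insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<const insert_T&>(x)(std::forward<base_T>(base));
}

// Hypothetical hijacker that inherits the default operator().
struct noop_t : ltor_insert<noop_t> {};

// Hypothetical helper: the no-op leaves the chain intact.
inline std::string with_noop() {
  std::ostringstream out;
  out << "a";
  out << noop_t() << "b";
  return out.str();
}
```

The no-op insertion returns the stream itself, so the "b" after it lands in the same stream and the result is "ab".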

Finally, we want to be able to handle different ways of passing in the hijacker. We have the constant reference version, but we also need the non-constant reference, and the r-value reference (which lets us accept temporaries without resorting to call-by-value):

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, ltor_insert<insert_T>& x) ->
decltype(static_cast<insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<insert_T&>(x)(std::forward<base_T>(base));
}

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, ltor_insert<insert_T>&& x) ->
decltype(static_cast<insert_T&&>(x)(std::forward<base_T>(base))) {
  return static_cast<insert_T&&>(x)(std::forward<base_T>(base));
}
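The non-constant overload matters as soon as a hijacker carries state that changes on every insertion. As a sketch, here is a hypothetical counter_t whose ()-operator is non-const, together with all three overloads.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

template<typename insert_T> struct ltor_insert {};

// Constant reference version.
template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, const ltor_insert<insert_T>& x) ->
decltype(static_cast<const insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<const insert_T&>(x)(std::forward<base_T>(base));
}

// Non-constant reference version.
template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, ltor_insert<insert_T>& x) ->
decltype(static_cast<insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<insert_T&>(x)(std::forward<base_T>(base));
}

// R-value reference version.
template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, ltor_insert<insert_T>&& x) ->
decltype(static_cast<insert_T&&>(x)(std::forward<base_T>(base))) {
  return static_cast<insert_T&&>(x)(std::forward<base_T>(base));
}

// Hypothetical stateful hijacker: operator() is non-const because it
// counts how many times it has been inserted.
struct counter_t : ltor_insert<counter_t> {
  counter_t() : count_m(0) {}
  std::ostream& operator()(std::ostream& s) { return s << ++count_m; }
  int count_m;
};

// Hypothetical helper: inserts the same named counter twice.
inline std::string count_twice() {
  std::ostringstream out;
  counter_t c;
  out << c << ' ' << c;
  return out.str();
}
```

Since c is a named, non-const object, the non-constant overload is the one that gets picked, and the counter advances from 1 to 2 between the two insertions.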

And that’s it! To fill a console line with 80 equality signs, we can write:

cout << multi('=', 80) << endl;

Which will expand into:

cout << multi_t<char>('=', 80) << endl;

Which will expand into:

operator<<(cout, multi_t<char>('=', 80)) << endl;

Which will expand into:

static_cast<multi_t<char>&&>(x)(cout) << endl;

(where x is a multi_t keeping track of “80 equality signs”.) Which is really just calling:

x.operator()(cout) << endl;

Which puts 80 equality signs into cout and returns cout, which means that the expression reduces to:

cout << endl;

After 80 equality signs have been put into cout. Like a boss.
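For reference, here are the pieces assembled into one sketch. It writes into a std::ostringstream instead of cout so that the result can be inspected; the helper ruler is a hypothetical name, and the default ()-operator uses std::forward, which is my assumption about what is needed to make it compile in all cases.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

template<typename insert_T> struct ltor_insert {
  template<typename base_T>
  inline base_T&& operator()(base_T&& base) const {
    return std::forward<base_T>(base);
  }
};

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, const ltor_insert<insert_T>& x) ->
decltype(static_cast<const insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<const insert_T&>(x)(std::forward<base_T>(base));
}

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, ltor_insert<insert_T>& x) ->
decltype(static_cast<insert_T&>(x)(std::forward<base_T>(base))) {
  return static_cast<insert_T&>(x)(std::forward<base_T>(base));
}

template<typename base_T, typename insert_T>
inline auto
operator<<(base_T&& base, ltor_insert<insert_T>&& x) ->
decltype(static_cast<insert_T&&>(x)(std::forward<base_T>(base))) {
  return static_cast<insert_T&&>(x)(std::forward<base_T>(base));
}

template<typename what_T>
class multi_t : public ltor_insert<multi_t<what_T> > {
public:
  inline multi_t(what_T&& what, uint32_t times)
    : what_m(std::forward<what_T>(what)), times_m(times)
  {}
  inline std::ostream& operator()(std::ostream& s) const {
    for (uint32_t i = 0; i < times_m; ++i) {
      s << what_m;
    }
    return s;
  }
private:
  what_T what_m;
  uint32_t times_m;
};

template<typename what_T>
inline multi_t<what_T> multi(what_T&& what, uint32_t times) {
  return multi_t<what_T>(std::forward<what_T>(what), times);
}

// Hypothetical helper: same expression as in the post, different stream.
inline std::string ruler() {
  std::ostringstream out;
  out << multi('=', 80) << '\n';
  return out.str();
}
```

Swapping out for cout and '\n' for endl gives back the one-liner from the start of the post.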

2014–01–22 | c++, c++11, template meta programming, crtp