Template Magic


Well, what I’m going to talk about today isn’t really magic, but it does illustrate some things you can do with C++ templates that, while certainly not uncommon, may not come immediately to mind. We’re going to use templates to implement some code for dealing with byte-swapping (endianness issues). I originally wrote this content as part of an upcoming post, but that post was already getting to be several pages long. So, I’ve pulled the info out into this post, and will finish that one soon. For quite a long time, I used a simple, but ugly, set of byte swapping functions I had written early in my C++ programming experience. It was filled with a million little functions like:

    int hostToLittle(int);
    unsigned int hostToLittle(unsigned int);
    int hostToBig(int);
    unsigned int hostToBig(unsigned int);

and on and on and on. Each function body had a preprocessor switch that determined whether it should actually swap the bytes or not. It was big, it was nasty, and it was a pain to maintain. But, it worked.

Now, I’m planning to put my code out for the whole wide world to see, and while maybe there’s only one or two of you who will actually look at it, I sure didn’t want to put that mess out there. So, I set out to clean it up.

The first step was simple: I moved the actual byte swapping code out of the individual functions for each data type. The result was three new functions, one each for 2-byte, 4-byte, and 8-byte values. That at least opened up the possibility of creating platform-specific implementations (such as using the lwbrx instruction on PowerPC, for instance). There was still a ton of code, though, and lots of preprocessor conditionals.
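To make that step concrete, here is a minimal sketch of what three size-based routines could look like. The post does not show the original bodies, so the signatures (fixed-width unsigned integers from `<cstdint>`) and the shift-and-mask approach are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Assumed signatures: one routine per data size, operating on unsigned integers.
inline std::uint16_t swapBytes2(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

inline std::uint32_t swapBytes4(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

inline std::uint64_t swapBytes8(std::uint64_t v)
{
    // Byte-reverse each 32-bit half, then exchange the halves.
    const std::uint64_t lo = swapBytes4(static_cast<std::uint32_t>(v));
    const std::uint64_t hi = swapBytes4(static_cast<std::uint32_t>(v >> 32));
    return (lo << 32) | hi;
}

// Small sanity check: each routine reverses the byte order of its argument.
inline void checkSizedSwaps()
{
    assert(swapBytes2(0x1234) == 0x3412);
    assert(swapBytes4(0x12345678u) == 0x78563412u);
    assert(swapBytes8(0x0102030405060708ull) == 0x0807060504030201ull);
}
```

With routines like these, every per-type function only has to cast to the right width and forward, which is why the amount of code stays large even after this cleanup.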

So, I decided to consider using templates to simplify the code further. For some strange reason, my first thought was not to templatize all of those functions for each different supported data type. I know that is probably the more obvious first step, but it simply didn’t occur to me. Instead, I thought I could use templates to get rid of most of those preprocessor conditionals.

I use Boost’s endian.hpp header, which provides a handful of preprocessor macros that tell you the endianness of your host. The first step was to get this information out of the hands of the preprocessor, and give it to the compiler. That was easy:

    enum Endianness
    {
       BigEndian,
       LittleEndian,
    #if defined(BOOST_BIG_ENDIAN)
       HostEndianness = BigEndian
    #elif defined(BOOST_LITTLE_ENDIAN)
       HostEndianness = LittleEndian
    #else
    #error BeeLib only supports big and little endian systems.
    #endif
    };

You may notice that error condition and wonder “what the heck is that doing there, aren’t all machines big or little endian?” The truth is, any machine this code is likely to run on is, in fact, big or little endian. But there also exists a class of CPUs that are middle-endian, and perhaps even a few other oddities. While I doubt anyone will ever try to build this on something so exotic, if they do, this will show them exactly where and what the error is, instead of in some other part of code where they try to use HostEndianness.
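For readers who want to try this without Boost, the sketch below builds the same enum from the `__BYTE_ORDER__` family of macros that GCC and Clang predefine. That substitution, the `_WIN32` fallback, the C++11 `static_assert`, and the helper names `detectAtRuntime` and `checkHostEndianness` are all assumptions and not part of the original code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

enum Endianness
{
    BigEndian,
    LittleEndian,
    // Assumption: compiler-predefined macros stand in for Boost's header here.
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    HostEndianness = BigEndian
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    HostEndianness = LittleEndian
#elif defined(_WIN32)
    HostEndianness = LittleEndian
#else
#error Only big and little endian hosts are supported.
#endif
};

// HostEndianness is a compile-time constant, so the compiler can reason about it.
static_assert(HostEndianness == BigEndian || HostEndianness == LittleEndian,
              "unsupported host byte order");

// Hypothetical runtime cross-check: look at the first byte of a known value.
inline Endianness detectAtRuntime()
{
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? LittleEndian : BigEndian;
}

inline void checkHostEndianness()
{
    assert(detectAtRuntime() == HostEndianness);
}
```

The runtime check is only there to confirm the macro-based choice. The point of the enum is that nothing downstream needs a runtime test or a preprocessor conditional.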

Initially, I wrote a template class, with all of those same functions from before as static members, something like this:

    template<Endianness in, Endianness out>
    class ByteSwapper
    {
    public:
       static int swapBytes(int);
       static unsigned int swapBytes(unsigned int);
       …
    };

and then template specialization for each of the four possible combinations of endiannesses. That’s right, four implementations of each of those functions:

    template<> int ByteSwapper<BigEndian, BigEndian>::swapBytes(int in) { return in; }
    template<> int ByteSwapper<LittleEndian, LittleEndian>::swapBytes(int in) { return in; }
    template<> int ByteSwapper<BigEndian, LittleEndian>::swapBytes(int in) { return swapBytes4(in); }
    template<> int ByteSwapper<LittleEndian, BigEndian>::swapBytes(int in) { return swapBytes4(in); }

Ick! That is, I think, worse than the original! Surely we can do better than that!

Of course we can! The next iteration let me cut that list of functions in half, by adding a template parameter to decide if we should swap bytes or not. I never checked this version into source control, so this is from memory, and it almost certainly has some mistakes, but it looked something like this:

    template<Endianness in, Endianness out, bool shouldSwap = (in != out)>
    class ByteSwapper
    template<> int swapBytes<in, out, true>(int in) { return swapBytes4(in); }
    template<> int swapBytes<in, out, false>(int in) { return in; }
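Since that listing is a recollection rather than compilable code, here is one way the same idea could be written so that it builds. It is a sketch under assumptions: it uses a struct with a partial specialization on the bool parameter, a stand-in `swapBytes4` that takes `std::uint32_t`, and a hypothetical `checkByteSwapper` helper. The original version may well have differed.

```cpp
#include <cassert>
#include <cstdint>

enum Endianness { BigEndian, LittleEndian };

// Stand-in for the 4-byte routine; the signature is an assumption.
inline std::uint32_t swapBytes4(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Primary template: the third parameter is computed from the first two,
// so callers never spell it out. This version handles shouldSwap == true.
template<Endianness in, Endianness out, bool shouldSwap = (in != out)>
struct ByteSwapper
{
    static std::uint32_t swapBytes(std::uint32_t v) { return swapBytes4(v); }
};

// Partial specialization for matching byte orders: pass the value through.
template<Endianness in, Endianness out>
struct ByteSwapper<in, out, false>
{
    static std::uint32_t swapBytes(std::uint32_t v) { return v; }
};

inline void checkByteSwapper()
{
    assert((ByteSwapper<BigEndian, BigEndian>::swapBytes(0x12345678u) == 0x12345678u));
    assert((ByteSwapper<BigEndian, LittleEndian>::swapBytes(0x12345678u) == 0x78563412u));
}
```

Two bodies per data type replace the four explicit specializations, because the bool collapses the four endianness combinations into "same" and "different".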

Now that’s not so bad, but, you guessed it, we can still do better! What’s stopping us, for instance, from templatizing on the type of data we’re being asked to swap, so we only need one pair of swap functions for each data size, instead of for each data type? As it turns out, nothing’s stopping us!

First things first, I made the actual byte swapping routines template functions. This reduced the complexity of the calling code, since there was no more need to do any casts or type punning:

    template<class T> T swapBytes2(T in)
    {
       BOOST_STATIC_ASSERT(sizeof(T) == 2); // Make sure this is actually a 2-byte data type
       // do some byte swapping here
       return result;
    }
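The body of that routine is elided above. As a hedged illustration of how such a template could avoid casts and type punning, the sketch below copies the value into a byte array with `std::memcpy`, swaps the two bytes, and copies them back. The `memcpy` approach, the C++11 `static_assert` in place of `BOOST_STATIC_ASSERT`, and the `checkSwapBytes2` helper are assumptions, not the original implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

template<class T> T swapBytes2(T in)
{
    // Make sure this is actually a 2-byte data type.
    static_assert(sizeof(T) == 2, "swapBytes2 requires a 2-byte type");

    unsigned char bytes[2];
    std::memcpy(bytes, &in, 2);       // copy the object representation out
    std::swap(bytes[0], bytes[1]);    // exchange the two bytes
    T result;
    std::memcpy(&result, bytes, 2);   // rebuild a T from the swapped bytes
    return result;
}

inline void checkSwapBytes2()
{
    const std::uint16_t u = 0x1234;
    assert(swapBytes2(u) == 0x3412);
    const std::int16_t s = 0x0102;
    assert(swapBytes2(s) == 0x0201);
}
```

Because T is deduced from the argument, the same body serves signed and unsigned 2-byte types alike.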

So, at this point, our goal is to come up with a single function that, from the caller’s perspective, looks like this:

  1. int swapped = swapBytes<LittleEndian, HostEndianness>(unswapped);

The caller shouldn’t need to do anything in the way of telling the function how big the data to be swapped is, or whether the data actually needs to be swapped or not. But to function, our swap function needs to know all of that stuff. So, we’ll shield the caller from all of these details by breaking things out into a public interface (the swapBytes function) and an implementation, which will go into a template struct called swapBytes_impl. Since you can’t partially specialize function templates, putting the implementation in a struct also allows us to use specialization, but still keep the main entry point a naked function, without the added scoping of a struct or class.

The base implementation looks like this:

    template<bool, size_t>
    struct swapBytes_impl
    {
       template <class T>
       static T swapBytes(T in) { return in; }
    };

And the swapBytes function looks like this:

    template<Endianness in_endian, Endianness out_endian, class T>
    T swapBytes(T in)
    {
       // make sure we can actually handle this sized data
       BOOST_STATIC_ASSERT(sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8);
       // Don't try to swap anything except built-in arithmetic types (integers & floats).
       // If, for some reason, you think your data should be swapped as-is, then you'll
       // need to cast it to the appropriately sized type yourself. You should also reconsider
       // your design, because it's very possibly wrong. Keep in mind, also, that swapping a
       // pointer makes very little sense.
       BOOST_STATIC_ASSERT(boost::is_arithmetic<T>::value);

       return swapBytes_impl<in_endian == out_endian, sizeof(T)>::swapBytes(in);
    }

Sweet. So now we have a function that can handle not swapping bytes for integers, floats, etc etc etc. Now, we just create some specializations of the implementation that actually do the swapping, one for each data size we can handle:

    template<>
    struct swapBytes_impl<false, 4>
    {
       template <class T>
       static T swapBytes(T in)
       {
          return swapBytes4(in);
       }
    };

(the rest of these functions look exactly the same, except with the 4’s replaced by 2’s and 8’s)
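Putting the pieces together, the following is a self-contained sketch of the whole design that compiles without Boost. Several things are assumptions made to keep it short: C++11 `static_assert` and `std::is_arithmetic` replace the Boost equivalents, a single hypothetical `reverseBytes` helper stands in for the separate swapBytes2/4/8 routines, and `checkSwapBytes` is an illustrative helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

enum Endianness { BigEndian, LittleEndian };

// Hypothetical helper standing in for swapBytes2/4/8: reverses all bytes of T.
template<class T> T reverseBytes(T in)
{
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &in, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&in, bytes, sizeof(T));
    return in;
}

// Default implementation: byte orders match, so the value passes through.
template<bool sameOrder, std::size_t size>
struct swapBytes_impl
{
    template<class T> static T swapBytes(T in) { return in; }
};

// One full specialization per supported size for the "orders differ" case.
template<> struct swapBytes_impl<false, 2>
{
    template<class T> static T swapBytes(T in) { return reverseBytes(in); }
};
template<> struct swapBytes_impl<false, 4>
{
    template<class T> static T swapBytes(T in) { return reverseBytes(in); }
};
template<> struct swapBytes_impl<false, 8>
{
    template<class T> static T swapBytes(T in) { return reverseBytes(in); }
};

// Public entry point: T is deduced, the bool and the size are computed.
template<Endianness in_endian, Endianness out_endian, class T>
T swapBytes(T in)
{
    static_assert(sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8,
                  "unsupported data size");
    static_assert(std::is_arithmetic<T>::value,
                  "only built-in arithmetic types can be swapped");
    return swapBytes_impl<in_endian == out_endian, sizeof(T)>::swapBytes(in);
}

inline void checkSwapBytes()
{
    const std::uint32_t value = 0x12345678u;
    assert((swapBytes<LittleEndian, LittleEndian>(value) == 0x12345678u));
    assert((swapBytes<LittleEndian, BigEndian>(value) == 0x78563412u));
}
```

Passing a 1-byte type or a pointer fails at compile time through the two assertions, rather than silently selecting the pass-through default.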

And that’s it, we’re done! So, what happens when we make a call to this byte swapping function? Let’s start with the simple case:

    int swapped = swapBytes<LittleEndian, LittleEndian>(unswapped);

The C++ compiler automatically fills in the last template parameter, T, with the type of the data being passed in, in this case “int”. So, the generated function would look something like:

    int swapBytes(int in)
    {
       return swapBytes_impl<true, 4>::swapBytes(in);
    }

Since we don’t specialize for cases where the bool template parameter is true, this will just expand to the default implementation:

    int swapBytes_impl<true, 4>::swapBytes(int in)
    {
       return in;
    }

Now for the case where we actually need to do some swapping. Here’s the call:

    int swapped = swapBytes<LittleEndian, BigEndian>(unswapped);

The template expansion:

    int swapBytes(int in)
    {
       return swapBytes_impl<false, 4>::swapBytes(in);
    }

This time, the bool template parameter is false, and the second parameter is 4. Lo and behold, we have a template specialization for this case! Here’s the expansion:

    int swapBytes_impl<false, 4>::swapBytes(int in)
    {
       return swapBytes4(in);
    }

Beautiful! So, with templates, we’ve managed to reduce this code from a thousand lines or so down to around 200. Would you rather maintain 1000 lines or 200 lines? I thought so. Not to mention, we’ve reduced the number of actual function bodies from some ungodly number to 8. Less code == fewer errors. You can take a look at the final code here.

Can this be improved further? Probably, but it’s good enough for now. Of course, if anyone has any improvements to suggest, I am open to hear ‘em!



3 Comments so far

  1. mojmir on February 16, 2008 9:45 am

    this looks good, but unfortunately the link to final code does not work… can you fix it, please?

    many thanks!

  2. mojmir on February 19, 2008 4:14 am

  3. Andy Molloy on April 24, 2008 7:57 am

    I apologize for the long delay in responding to this, things have been quite hectic with the new baby and all. At any rate, the link is now fixed.