People who are interested in game programming typically have one or two specific areas that they consider The Fun Stuff. For many, it’s the graphics stuff, trying to figure out how to push more polys, with more detail while maintaining a reasonable framerate. For others, it’s AI, building smarter, nastier opponents for the player to face. There might even be a handful who love nothing more than hacking low level network code, so even a player on a gimpy dial-up connection gets to enjoy the action. But I don’t believe I’ve ever encountered anyone who’s said “I want to get into game programming because I just love figuring out how to serialize data to disk!” or “If only I could get a job dealing with localization code in video games!” Maybe there are a few of you out there, but for most of us, this sort of thing falls under the category of The Un-Fun Stuff. Unfortunately, the Un-Fun Stuff still needs to be written, and is still a vital part of developing a production quality game engine. So, I’ve decided I might as well get some of these items done and out of the way. In particular, I’ll be tackling data serialization.

If you’re not familiar with the concept of data serialization, the basic idea is to take a data structure and put it into a flattened form that can then be written to disk, sent across a network, or copied to another chunk of memory. Once a data structure has been serialized, you should be able to turn around and deserialize it, and get the same data structure. Sounds easy initially, but it turns out there are quite a few complications. Perhaps the biggest complications come from dealing with pointers and references. When you deserialize the data structure, you have no guarantee - indeed, no expectation - that pieces of the structure will be in the same locations in memory. Generally speaking, this problem is solved by keeping track of every pointer in the structure, what they’re supposed to point to, and their actual locations in the new data structure. Then, once every item to be deserialized is built up, you walk through all of these pointers and fix them up (usually referred to as linking) to point to the new, correct memory locations.
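To make the linking step a little more concrete, here is a minimal sketch of the idea using nothing but the standard library. The `Node` and `NodeRecord` types and the `flatten` and `rebuild` functions are made up for illustration; a real serialization library does far more bookkeeping than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// A hypothetical in-memory structure that contains a pointer.
struct Node {
    int value = 0;
    Node* next = nullptr;
};

// Flattened form: the pointer is replaced by an index into the record list
// (-1 stands for a null pointer).
struct NodeRecord {
    int value;
    std::int32_t nextIndex;
};

// Serialize: give every node an index, then store indices instead of addresses.
std::vector<NodeRecord> flatten(const std::vector<Node*>& nodes) {
    std::unordered_map<const Node*, std::int32_t> indexOf;
    for (std::size_t i = 0; i < nodes.size(); ++i)
        indexOf[nodes[i]] = static_cast<std::int32_t>(i);

    std::vector<NodeRecord> records;
    for (const Node* n : nodes) {
        const std::int32_t next = n->next ? indexOf.at(n->next) : -1;
        records.push_back(NodeRecord{n->value, next});
    }
    return records;
}

// Deserialize: build every object first, then walk the recorded indices and
// link each pointer to the new address of its target. The pointers refer
// into the returned vector's storage.
std::vector<Node> rebuild(const std::vector<NodeRecord>& records) {
    std::vector<Node> nodes(records.size());
    for (std::size_t i = 0; i < records.size(); ++i)
        nodes[i].value = records[i].value;
    for (std::size_t i = 0; i < records.size(); ++i) {
        const std::int32_t target = records[i].nextIndex;
        if (target >= 0) {
            assert(static_cast<std::size_t>(target) < nodes.size());
            nodes[i].next = &nodes[static_cast<std::size_t>(target)];
        }
    }
    return nodes;
}
```

The two passes in `rebuild` are the important part: a pointer can only be fixed up once the object it refers to exists, which is why the linking happens after everything has been constructed.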

Now, this is probably a good time to mention that I am lazy. If I can avoid writing code, I usually do.

With all of that said, I’m going to pull a fast one on you. We’re not going to implement serialization today. No sir! You see, the Boost library already contains pretty nice support for data serialization. It does very nearly everything I want, it’s got a pretty nice interface, and boy-oh-boy does it save a lot of work. That’s not to say it’s a free ride, but certainly a lot better than starting from scratch. There’s plenty of great documentation for it on the Boost web site. A very quick summary: you’ll have to write some fairly simple functions to serialize your data. The serialization library already knows how to handle PODs, pointers, and references. So, an example might look like this:

  #include <boost/serialization/access.hpp>

  class foo
  {
     public:
        foo() : a(0), b('a') {}
     private:
        int a;
        char b;

        // Lets the library reach the private serialize() below.
        friend class boost::serialization::access;
        template<class Archive>
        void serialize(Archive & ar, const unsigned int /* version */)
        {
           ar & a;
           ar & b;
        }
  };

The first thing you might notice is the strange use of the & operator in the serialize function. This operator is equivalent to << when serializing (saving) data, and >> when deserializing (loading) data. In simple cases such as this, that lets you write a single function to handle both operations. Of course, not all situations are this simple, but there are plenty of tools in the serialization library to handle the more complicated cases.
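If the dual role of & seems like magic, a toy version shows roughly how it can work. The `ToyOutputArchive`, `ToyInputArchive`, and `Point` types below are made up for illustration and are not Boost's classes; they only mimic the shape of the interface, using a whitespace-separated text stream.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Toy output archive: & forwards to <<, which writes the value.
class ToyOutputArchive {
public:
    explicit ToyOutputArchive(std::ostream& os) : os_(os) {}

    template <class T>
    ToyOutputArchive& operator<<(const T& value) {
        os_ << value << ' ';
        return *this;
    }

    template <class T>
    ToyOutputArchive& operator&(const T& value) { return *this << value; }

private:
    std::ostream& os_;
};

// Toy input archive: & forwards to >>, which reads the value back.
class ToyInputArchive {
public:
    explicit ToyInputArchive(std::istream& is) : is_(is) {}

    template <class T>
    ToyInputArchive& operator>>(T& value) {
        is_ >> value;
        return *this;
    }

    template <class T>
    ToyInputArchive& operator&(T& value) { return *this >> value; }

private:
    std::istream& is_;
};

// One serialize() serves both directions, because the meaning of &
// depends on which archive type it is instantiated with.
struct Point {
    int x = 0;
    int y = 0;

    template <class Archive>
    void serialize(Archive& ar) {
        ar & x;
        ar & y;
    }
};
```

Calling `p.serialize(outputArchive)` writes the fields, and calling `q.serialize(inputArchive)` on the same stream reads them back in the same order, which is the whole trick.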

One thing the Boost serialization library does not have out of the box is a portable binary archive format. There is an example in the documentation that can deal with different endianness of integers, and has the added benefit that it does some very minor compression of integral data. It does not, however, deal with floating point numbers. Also, the endianness of the archive data is hard coded to little endian. So, even if your primary platform is big endian, you’ll either have to change the archive code itself or always pay for byte swaps, even though they’re unnecessary. So, I’ve decided to create a custom portable binary archive format.

I should say here that, by portable, what I really mean is portable between most common desktop platforms. Right now, that means to me x86 and PPC. So we will only need to handle swapping bytes between little and big endian formats. Both platforms use IEEE 754 floating point representation, so floats and doubles will also require nothing more than a byte swap.
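As a rough sketch of what "nothing more than a byte swap" means for a float, the helpers below reverse the bytes of a value via `memcpy`. The names `reverseBytes`, `swappedBitsOf`, and `floatFromSwappedBits` are my own for this example, and the code assumes a 32-bit IEEE 754 float. Note that the swapped value is kept as an integer bit pattern: a byte-reversed float is not a meaningful number, and I'd rather not pass it around as a `float`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Reverse the bytes of an unsigned integer value.
template <class UInt>
UInt reverseBytes(UInt value) {
    unsigned char bytes[sizeof(UInt)];
    std::memcpy(bytes, &value, sizeof(UInt));
    std::reverse(bytes, bytes + sizeof(UInt));
    std::memcpy(&value, bytes, sizeof(UInt));
    return value;
}

// Copy the float's bit pattern into an integer, then swap the integer.
std::uint32_t swappedBitsOf(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return reverseBytes(bits);
}

// Undo the swap and copy the bit pattern back into a float.
float floatFromSwappedBits(std::uint32_t swapped) {
    const std::uint32_t bits = reverseBytes(swapped);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

A double would work the same way with `std::uint64_t`.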

Fortunately, Boost Serialization makes it fairly easy to create your own archive format, and the process of doing so is well documented. I’ll just talk about the archive output code here; the input code is nearly identical and, as always, can be found in the Programmicon SVN Repository.

First of all, I’ll use the endianness utilities I talked about before to handle all of the byte swapping. Also, the archive itself will have a template parameter to specify the endianness of the archive. The Boost Serialization documentation is pretty good about explaining how to start creating your own archive format, so I won’t repeat all of that. Instead, I’ll just focus on the specifics of our portable format.

Since our archive will be a binary format, we’ll start by subclassing boost::archive::binary_oarchive_impl. I haven’t bothered to look into the actual code for binary_oarchive_impl, but the way we have to inherit from it makes it pretty clear it uses the Curiously Recurring Template Pattern:

  template<Endianness dataEndianness>
  class portable_oarchive :
      public boost::archive::binary_oarchive_impl<portable_oarchive<dataEndianness> >
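In case the Curiously Recurring Template Pattern is new to you, here is a small standalone sketch of it. `WriterBase` and `TaggingWriter` are invented names and have nothing to do with Boost's internals; the point is only that the base class takes the derived class as a template parameter, so it can call the derived class's functions without any virtual dispatch.

```cpp
#include <cassert>
#include <string>

template <class Derived>
class WriterBase {
public:
    // The base forwards to the derived class's save() overloads.
    template <class T>
    void write(const T& value) { self().save(value); }

    const std::string& log() const { return log_; }

protected:
    void record(const std::string& entry) { log_ += entry; }

private:
    // Safe because Derived is required to inherit from WriterBase<Derived>.
    Derived& self() { return static_cast<Derived&>(*this); }

    std::string log_;
};

// The derived class names itself as the template argument of its own base.
class TaggingWriter : public WriterBase<TaggingWriter> {
public:
    void save(int) { record("int;"); }
    void save(const std::string&) { record("string;"); }
};
```

With this in place, `TaggingWriter w; w.write(42);` ends up in `TaggingWriter::save(int)` even though the call goes through the base class.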

In order to handle swapping our bytes, we will need to provide a set of save() functions, overloaded for each data type we want to handle. We also need to provide a templated version of this function to handle anything we don’t want to byte swap (such as strings and whatnot). Here’s a chunk of the code:

  // Here's our fall-through function. Any data type we don't directly handle
  // will be handled here. Note that this means anything that comes through
  // this function will not be byte swapped. Since we will be overloading this
  // function for all of the types we DO want byte swapped, that will be just fine.
  template<class T>
  void save(const T & t)
  {
     boost::archive::binary_oarchive_impl<derived_t>::save(t);
  }

  // Here's an example of a data type that we DO want to byte swap.
  // We'll need a function like this for every type we swap.
  void save(int16 t)
  {
     t = swapBytes<dataEndianness, HostEndianness>(t);
     this->save_binary(&t, sizeof(int16));
  }

The “default” save function is pretty simple, just passing its responsibility on to the base class. The overload for the int16 data type isn’t much more complicated. We simply swap bytes using one of the endianness tools I presented earlier. This is effectively a no-op if dataEndianness and HostEndianness are the same. After that, we pass the data on to save_binary(), a function that comes from binary_oarchive_impl, to actually write the data to the archive. And that’s it! We just provide similar functions for each data type we want to handle, specifically int16, uint16, int32, uint32, int64, uint64, float, and double.
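Here is a standalone toy that shows the same dispatch in action, so you can see which values get swapped and which fall through. Everything in it is an assumption made for the example: `ToyPortableWriter`, `hostIsLittleEndian`, and the `Endianness` enum are stand-ins I wrote for this sketch, not the real archive or the endianness utilities, and the host byte order is probed at run time rather than fixed at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum Endianness { LittleEndian, BigEndian };

// Probe the host byte order by looking at the first byte of a known value.
inline bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

template <Endianness DataEndianness>
class ToyPortableWriter {
public:
    // Fall-through: any type without a dedicated overload is copied as-is.
    template <class T>
    void save(const T& t) { append(&t, sizeof(T)); }

    // For an exact uint16_t argument, this non-template overload is
    // preferred over the template above.
    void save(std::uint16_t t) {
        const Endianness host = hostIsLittleEndian() ? LittleEndian : BigEndian;
        if (host != DataEndianness)
            t = static_cast<std::uint16_t>((t << 8) | (t >> 8));
        append(&t, sizeof t);
    }

    const std::vector<unsigned char>& bytes() const { return bytes_; }

private:
    void append(const void* data, std::size_t count) {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        bytes_.insert(bytes_.end(), p, p + count);
    }

    std::vector<unsigned char> bytes_;
};
```

One thing worth keeping in mind with this style: the swapping overload is only chosen when the argument's type matches exactly. A plain literal like `5` is an `int`, so in this toy it would go through the fall-through template and be written unswapped, which is why every type you care about needs its own overload.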

That’s really pretty much all there is to the portable archive format. I think the actual code for this thing is smaller than this stupid blog post! Of course, there are still all of the details of how to integrate this stuff into your code, and deciding what data actually needs to be serialized. But that is all outside of the scope of this post, which has gone on for long enough already, so I’ll leave that stuff as an exercise for the reader.

