Online release of Data-Oriented Design:
This is the free, online, reduced version. Some inessential chapters are excluded from this version, but in the spirit of this being an educational resource, the essentials are present for anyone wanting to learn about data-oriented design.
Expect some odd formatting and some broken images and listings, as this is auto-generated and the LaTeX to HTML converters available are not perfect. If a source code listing is broken, you should be able to find the referenced source on GitHub. If you like what you read here, consider purchasing the real paper book, as not only will it look a lot better, but it will help keep this version online for those who cannot afford to buy it. Please send any feedback to


What's wrong?

What's wrong with object-oriented design? Where's the harm in it?

Over the years, game developers have fallen into a style of C++ that is so unappealing to hardware that the managed languages don't seem all that much slower in comparison. The pattern of usage of C++ in game development was so appallingly mismatched to the hardware of the PlayStation 3 and Xbox 360 generation that it is no wonder an interpreted language is only in the region of 50% slower under normal use, and sometimes faster in its specialist areas. What is this strange language that has embedded itself into the minds of C++ game developers? What is it that makes the fashionable way of coding games one of the worst ways of making use of the machines we're targeting? Where, in essence, is the harm in game-development-style object-oriented C++?

Some of this comes from the initial interpretation of what object-oriented means, as game developers tended to believe that object-oriented meant you had to map everything you cared about into the code as instances of objects. This form of object-oriented development could be interpreted as instance-oriented development, and it puts the singular unique entity ahead of the program as a whole. When put this way, it is easier to see some of the problems that can arise. The performance of an individual object is very hard to decry as poor, as object methods are hard to time accurately and unlikely to be timed at all. When your development practices promote individual elements above the program as a whole, you also pay a mental-capacity penalty, as you have to consider all operations from the point of view of the actors, with their hidden state, not from a point of view of value semantics.

Another issue is that performance appears not to have been ignored by the language designers, but instead to have been tested for quality in isolation. This could be because the real-world uses of C++ are quite different from the expectations of the library providers, or it could be that the library providers are working to internal metrics instead of making sure they understand their customers. It's the opinion of the author that when developing a library, or a set of templates for use in C++, it shouldn't just be possible to tune performance out of the code you are using; it should come by default. If you merely make it possible to tune performance, you trade understanding and performance for features. This is a poor trade for game developers, but it has been accepted, as the benefit of a common language is a very tempting offer.

The harm

Claim: Virtuals don't cost much, but if you call them a lot it can add up.
aka - death by a thousand paper cuts

The overhead of a virtual call is negligible under simple inspection. Compared to what you do inside a virtual call, the extra dereference required seems petty and very likely not to have any noticeable side effects other than the cost of a dereference and the extra space taken up by the virtual table pointer. The extra dereference before getting the pointer to the function we want to call on this particular instance seems to be a trivial addition, but let's have a closer look at what is going on.

A class that has derived from a base class with virtual methods has a certain structure to it. Adding any virtual methods to a class instantly adds a virtual table to the executable, and a virtual table pointer as the implicit first data member of the class. There is little way around this. The language specification allows the data layout of classes to be left up to the compiler, to the point where compilers can implement such things as virtual methods by adding hidden members and generating new arrays of function pointers behind the scenes. It is possible to do this differently, but it appears most compilers implement virtual tables to store virtual method function pointers. It's important to remember virtual calls are not an operating-system-level concept, and they don't exist as far as the CPU is concerned; they are just an implementation detail of C++.

When we call a virtual method on a class we have to know what code to run. Normally we need to know which entry in the virtual table to access, and to do that we read the first data member in order to access the right virtual table for calling. This requires loading from the address of the class into a register and adding an offset to the loaded value. Every non-trivial virtual method call is a lookup into a table, so in the compiled code, all virtual calls are really function pointer array dereferences, which is where the offset comes in. It's the offset into the array of function pointers. Once the address of the real function pointer is generated, only then can instruction decoding begin. There are ways to not call into the virtual table, notably with C++11, there has been some progress with the final keyword that can help as classes that cannot be overridden can now know that if they call into themselves, then they can call functions directly. This doesn't help for polymorphic calls, or call sites that access the methods from the interface without knowing the concrete type (see listing [*]), but it can occasionally help with some idioms such as private implementation (pImpl), and the curiously recurring template pattern.

Listing: A simple derived class (src/HARM_basederived.cpp in the book's source repository).

For multiple inheritance it is slightly more convoluted, but basically it's still virtual tables; now each function defines which class's vtable it references.

So let's count up the actual operations involved in this method call: first we have a load, then an add, then another load, then a branch. To almost all programmers this doesn't seem like a heavy cost to pay for runtime polymorphism. Four operations per call so you can throw all your game entities into one array and loop through them updating, rendering, gathering collision state, spawning off sound effects. This seems to be a good trade-off, but it was only a good trade-off when these particular instructions were cheap.

Two out of the four instructions are loads, which don't seem like they should cost much, but unless you hit a nearby cache, a load takes a long time, and instructions take time to decode. The add, which modifies the register value to address the correct function pointer, is very cheap, but the branch is not always cheap, as it doesn't know where it's going until the second load completes. This can cause an instruction cache miss. All in all, it's common to see a chunk of time wasted due to a single virtual call in any significantly large-scale game. In that chunk of time, the floating-point unit alone could have finished naïvely calculating lots of dot products, or a decent pile of square roots. In the best case, the virtual table pointer will already be in memory, the object type will be the same as last time, so the function pointer address will be the same, and therefore the function pointer will be in cache too; in that circumstance it's likely the branch won't stall, as the instructions are probably still in the cache as well. But this best case is not always the common case for all types of data.

Consider the alternative, where your function ends, and you are returning some value, then calling into another function. The order of instructions is fairly well known, and to the CPU looks very similar to a straight line. There are no deviations from getting instructions based on just following the program counter along each function in turn. It's possible to guess quite far ahead the address of any new functions that will be called, as none of them are dependent on data. Even with lots of function calls, the fact they are deducible at compile time makes them easy to prefetch, and pretranslate.

The implementation of C++ doesn't like how we iterate over objects. The standard way of iterating over a set of heterogeneous objects is to literally do that, grab an iterator and call the virtual function on each object in turn. In normal game code, this will involve loading the virtual table pointer for each and every object. This causes a wait while loading the cache line, and cannot easily be avoided. Once the virtual table pointer is loaded, it can be used, with the constant offset (the index of the virtual method), to find the function pointer to call, however, due to the size of virtual functions commonly found in game development, the table won't be in the cache. Naturally, this will cause another wait for load, and once this load has finished, we can only hope the object is actually the same type as the previous element, otherwise, we will have to wait some more for the instructions to load.

Even without loads, not knowing which function will be called until the data is loaded means you rely on a cache line of information before you can be confident you are decoding the right instructions.

The reason virtual functions in games are large is that game developers have had it drilled into them that virtual functions are okay, as long as you don't use them in tight loops, which invariably leads to them being used for more architectural considerations such as hierarchies of object types, or classes of solution helpers in tree-like problem-solving systems (such as pathfinding, or behaviour trees).

Let's go over this again: many developers now believe the best way to use virtuals is to put large workloads into the body of the virtual methods, so as to mitigate the overhead of the virtual call mechanism. However, doing this, you can virtually guarantee that not only will a large portion of the instruction and data cache be evicted by each call to update(), but most branch predictor slots may become dirty too, and fail to offer any benefit when the next update() runs. Assuming virtual calls don't add up because they are called only in high-level code is fine until that becomes the general programming style, leading to developers failing to think about how the calls affect the application, and ultimately to millions of virtual calls per second. All those inefficient calls are going to add up and impact the hardware, but they hardly ever appear in any profiles. The issue isn't that the cost isn't there; it's that it's spread thinly over the whole of the processing of the machine. They always appear somewhere in the code being called.

Carlos Bueno's book, the Mature Optimization Handbook [MatureOpt], talks about how easy it is to miss the real cause of slowness by blindly following the low-hanging-fruit approach. This is where the idea of creating a hypothesis proves useful, as when an optimisation turns out not to reap the expected rewards, you can retrace and regroup faster. At Facebook, they traced what was causing cache evictions and optimised those functions, not for speed, but to reduce as much as possible the chance that they evicted other data from the cache.

In C++, classes' virtual tables store function pointers by their class. The alternative is to have a virtual table for each function and switch function pointer on the type of the calling class. This works fine in practice and does save some of the overhead as the virtual table would be the same for all the calls in a single iteration of a group of objects. However, C++ was designed to allow for runtime linking to other libraries, libraries with new classes that may inherit from the existing codebase. The design had to allow a runtime linked class to add new virtual methods, and have them callable from the original running code. If C++ had gone with function oriented virtual tables, the language would have had to runtime patch the virtual tables whenever a new library was linked, whether at link-time for statically compiled additions, or at runtime for dynamically linked libraries. As it is, using a virtual table per class offers the same functionality but doesn't require any link-time or runtime modification to the virtual tables as the tables are oriented by the classes, which by the language design are immutable during link-time.

Combining the organisation of virtual tables and the order in which games tend to call methods, even when running through lists in a highly predictable manner, cache misses are commonplace. It's not just the implementation of classes that causes these cache misses; it's any time data is the deciding factor in which instructions are run. Games commonly implement scripting languages, and these languages are often interpreted and run on a virtual machine. However the virtual machine or JIT compiler is implemented, there is always an aspect of data controlling which instructions will be called next, and this causes branch misprediction. This is why, in general, interpreted languages are slower: they either run code based on loaded data, in the case of bytecode interpreters, or they compile code just in time, which, though it creates faster code, causes issues of its own.

When a developer implements an object-oriented framework without using the built-in virtual functions, virtual tables, and this pointers present in the C++ language, it doesn't reduce the chance of a cache miss unless they use virtual tables by function rather than by class. But even when the developer has been especially careful, the very fact that they are doing object-oriented programming with game-developer access patterns, that of calling singular virtual functions on arrays of heterogeneous objects, means they are still going to have some of the same instruction decode and cache misses as found with built-in virtuals. That is, the best they can hope for is one less data-dependent CPU state change per virtual call. That still leaves the opportunity for two mispredictions.

So, with all this apparent inefficiency, what makes game developers stick with object-oriented coding practices? As game developers are frequently cited as a source of how the bleeding edge of computer software development is progressing, why have they not moved away wholesale from the problem and stopped using object-oriented development practices altogether?

Mapping the problem

Claim: Objects provide a better mapping from the real world description of the problem to the final code solution.

Object-oriented design when programming in games starts with thinking about the game design in terms of entities. Each entity in the game design is given a class, such as ship, player, bullet, or score. Each object maintains its own state, communicates with other objects through methods, and provides encapsulation so when the implementation of a particular entity changes, the other objects that use it or provide it with utility do not need to change. Game developers like abstraction, because historically they have had to write games for not just one target platform, but usually at least two. In the past, it was between console manufacturers, but now game developers have to manage between Windows and console platforms, plus the mobile targets too. The abstractions in the past were mostly hardware access abstractions, and naturally some gameplay abstractions as well, but as the game development industry matured, we found common forms of abstractions for areas such as physics, AI, and even player control. Finding these common abstractions allowed for third-party libraries, and many of these use object-oriented design as well. It's quite common for libraries to interact with the game through agents. These agent objects contain their own state data, whether hidden or publicly accessible, and provide functions by which they can be manipulated inside the constraints of the system that provided them.

The game design inspired objects (such as ship, player, level) keep hold of agents and use them to find out what's going on in their world. A player interacts with physics, input, animation, other entities, and doing this through an object-oriented API hides much of the details about what's actually required to do all these different tasks.

The entities in object-oriented design are containers for data and a place to keep all the functions that manipulate that data. Don't confuse these entities with those of entity systems, as the entities in object-oriented design are of immutable class over their lifetime. An object-oriented entity does not change class during its lifetime in C++ because there is no process by which to reconstruct a class in place in the language. As can be expected, when they don't have the right tools for the job, good workmen work around it. Game developers don't change the type of their objects at runtime; instead, they create a new object and destroy the old one in the case of a game entity that needs this functionality. But, as is often the case, because the feature is not present in the language, it is underutilised even when it would make sense.

For example, in a first-person shooter, an object will be declared to represent the animating player mesh, but when the player dies, a clone would be made to represent the dead body as a rag doll. The animating player object may be made invisible and moved to their next spawn point while the dead body object with its different set of virtual functions, and different data, remains where the player died so as to let the player watch their dead body. To achieve this sleight of hand, where the dead body object sits in as a replacement for the player once they are dead, copy constructors need to be defined. When the player is spawned back into the game, the player model will be made visible again, and if they wish to, the player can go and visit their dead clone. This works remarkably well, but it is a trick that would be unnecessary if the player could become a dead rag doll rather than spawn a clone of a different type. There is an inherent danger in this too, the cloning could have bugs, and cause other issues, and also if the player dies but somehow is allowed to resurrect, then they have to find a way to convert the rag doll back into the animating player, and that is no simple feat.

Another example is in AI. The finite state machines and behaviour trees that run most game AI maintain all the data necessary for all their potential states. If an AI has three states, { Idle, Making-a-stand, Fleeing-in-terror }, then it has the data for all three states. If Making-a-stand has a scared-points accumulator for tracking their fear, so they can fight, but only until they are too scared to continue, and Fleeing-in-terror has a timer so they will flee, but only for a certain time, then Idle carries these two unnecessary attributes as well. In this trivial example, the AI class has three data entries, { state, how-scared, flee-time }, and only one of these data entries is used by all three states. If the AI could change type when it transitioned from state to state, then it wouldn't even need the state member, as that functionality would be covered by the virtual table pointer. The AI would only allocate space for each of the state-tracking members when in the appropriate state. The best we can do in C++ is to fake it, by changing the virtual table pointer by hand, dangerous but possible, or by setting up a copy constructor for each possible transition.

Apart from immutable type, object-oriented development also has a philosophical problem. Consider how humans perceive objects in real life. There is always a context to every observation. The humble table, when you look at it, you may see it to be a table with four legs, made of wood and modestly polished. If so, you will see it as being a brown colour, but you will also see the reflection of the light. You will see the grain, but when you think about what colour it is, you will think of it as being one colour. However, if you have the training of an artist, you will know what you see is not what is actually there. There is no solid colour, and if you are looking at the table, you cannot see its precise shape, but only infer it. If you are inferring it is brown by the average light colour entering your eye, then does it cease to be brown if you turn off the light? What about if there is too much light and all you can see is the reflection off the polished surface? If you close one eye and look at its rectangular form from one of the long sides, you will not see right angle corners, but instead, a trapezium. We automatically adjust for this and classify objects when we see them. We apply our prejudices to them and lock them down to make reasoning about them easier. This is why object-oriented development is so appealing to us. However, what we find easy to consume as humans, is not optimal for a computer. When we think about game entities being objects, we think about them as wholes. But a computer has no concept of objects, and only sees objects as being badly organised data and functions randomly called on it.

To take the table example further, consider the table to have legs about three feet long. That's someone's standard table. If the legs are only one foot long, it could be considered to be a coffee table. Short, but still usable as a place to throw the magazines and leave your cups. But when you get down to one-inch-long legs, it's no longer a table, but instead just a large piece of wood with some stubs stuck on it. We can happily classify the same item, but with different dimensions, into three distinct classes of object: table, coffee table, a lump of wood with some little bits of wood on it. But at what point does the lump of wood become a coffee table? Is it somewhere between four- and eight-inch-long legs? This is the same problem as presented about sand: when does it transition from grains of sand to a pile of sand? How many grains make a pile, or a dune? The answer must be that there is no answer. The answer is also helpful in understanding how a computer thinks. It doesn't know the specific difference between our human classifications because, to a certain degree, even humans don't.

The class of an object is poorly defined by what it is, but better by what it does. This is why duck typing is a strong approach. We also realise, if a type is better defined by what it can do, then when we get to the root of what a polymorphic type is, we find it is only polymorphic in terms of what it can do. In C++, it's clear a class with virtual functions can be called as a runtime polymorphic instance, but it might not have been clear that if it didn't have those functions, it would not need to be classified in the first place. The reason multiple inheritance is useful stems from this. Multiple inheritance just means an object can behave, that is react, to certain impulses. It has declared that it can fulfil some contract of polymorphic function response. If polymorphism is just the ability for an object to fulfil a functionality contract, then we don't need virtual calls to handle that every time, as there are other ways to make code behave differently based on the object.

In most game engines, the object-oriented approach leads to a lot of objects in very deep hierarchies. A common ancestor chain for an entity might be: PlayerEntity → CharacterEntity → MovingEntity → PhysicalEntity → Entity → Serialisable → ReferenceCounted → Base.

These deep hierarchies virtually guarantee multiple indirect calls when calling virtual methods, but they also cause a lot of pain when it comes to cross-cutting code, that is code that affects or is affected by unrelated concerns, or concerns incongruous to the hierarchy. Consider a normal game with characters moving around a scene. In the scene you will have characters, the world, possibly some particle effects, lights, some static and some dynamic. In this scene, all these things need to be rendered, or used for rendering. The traditional approach is to use multiple inheritance or to make sure there is a Renderable base class somewhere in every entity's inheritance chain. But what about entities that make noises? Do you add an audio emitter class as well? What about entities that are serialised vs those that are explicitly managed by the level? What about those that are so common they need a different memory manager (such as the particles), or those that only optionally have to be rendered (like trash, flowers, or grass in the distance). This has been solved numerous times by putting all the most common functionality into the core base class for everything in the game, with special exceptions for special circumstances, such as when the level is animated, when a player character is in an intro or death screen, or is a boss character (who is special and deserves a little more code). These hacks are only necessary if you don't use multiple inheritance, but when you use multiple inheritance you then start to weave a web that could ultimately end up with virtual inheritance and the complexity of state that brings with it. The compromise almost always turns out to be some form of cosmic base class anti-pattern.

Object-oriented development is good at providing a human oriented representation of the problem in the source code, but bad at providing a machine representation of the solution. It is bad at providing a framework for creating an optimal solution, so the question remains: why are game developers still using object-oriented techniques to develop games? It's possible it's not about better design, but instead, making it easier to change the code. It's common knowledge that game developers are constantly changing code to match the natural evolution of the design of the game, right up until launch. Does object-oriented development provide a good way of making maintenance and modification simpler or safer?

Internalised state

Claim: Encapsulation makes code more reusable. It's easier to modify the implementation without affecting the usage. Maintenance and refactoring become easy, quick, and safe.

The idea behind encapsulation is to provide a contract to the person using the code rather than providing a raw implementation. In theory, well-written object-oriented code that uses encapsulation is immune to damage caused by changing how an object manipulates its data. If all the code using the object complies with the contract and never directly uses any of the data members without going through accessor functions, then no matter what you change about how the class fulfils that contract, there won't be any new bugs introduced by any change. In theory, the object implementation can change in any way, as long as the contract is not modified, but only extended. This is the open-closed principle: a class should be open for extension, but closed for modification.

A contract is meant to provide some guarantees about how a complex system works. In practice, only unit testing can provide these guarantees.

Sometimes, programmers unwittingly rely on hidden features of objects' implementations. Sometimes the object they rely on has a bug that just so happens to fit their use case. If that bug is fixed, then the code using the object no longer works as expected. The use of the contract, though it was kept intact, has not helped the other piece of code to maintain working status across revisions. Instead, it provided false hope that the returned values would not change. It doesn't even have to be a bug. Temporal couplings inside objects or accidental or undocumented features that go away in later revisions can also damage the code using the contract without breaking it.

Consider an implementation that maintains an internal list in sorted order, and a use case that accidentally relies on it (an unforeseen bug in the user's code, not an intentional dependency). When the maintainer pushes out a performance-enhancing update, the only thing the users are going to see is a pile of new bugs, and they will likely assume the performance update is suspect, not their own code.

A concrete example could be an item manager that kept a list of items sorted by name. If a function returned all the item types that matched a filter, then the caller could iterate the returned list until it found the item it wanted. To speed things up, it could early-out if it found an item with a name later than the one it was looking for, or it could do a binary search of the returned list. In either case, if the internal representation changed to something not ordered by name, the code would no longer work. If the internal representation changed to being ordered by hash, then the early-out and the binary search would be completely broken.

In many linked list implementations, there is a decision made about whether to store the length of the list or not. The choice to store a count member will make multi-threaded access slower, but the choice not to store it will make finding the length of the list an $\mathcal{O}(n)$ operation. For situations where you only want to find out whether the list is empty, if the object contract only supplies a get_count() function, you cannot know for sure whether it would be cheaper to check if the count was greater than zero, or to check if begin() and end() are the same. This is another example of the contract providing too little information.

Encapsulation only seems to provide a way to hide bugs and cause assumptions in programmers. There is an old saying about assumptions, and encapsulation doesn't let you confirm or deny them unless you have access to the source code. If you have, and you need to look at it to find out what went wrong, then all the encapsulation has done is add another layer to work around rather than add any useful functionality of its own.

Instance oriented development

Claim: Making every object an instance makes it very easy to think about what an object's responsibilities are, what its lifetime looks like, and where it belongs in the world of objects.

The first problem with instance thinking is that everything is centred around the idea of one item doing a thing, and that is a sure way to lead to poor performance.

The second, and more pervasive, issue with instance thinking is that it leads to thinking in the abstract about instances, and using full objects as building blocks for thought can lead to very inefficient algorithms. When you hide the internal representation of an item even from the programmer using it, you often introduce issues of translation from one way of thinking about an object to another, and back again. Sometimes you may have an item that needs to change another object, but cannot reach it in the world it finds itself in, so it has to send a message to its container to help it answer a question about another entity. Unfortunately, it's not uncommon for programs to lose sight of the data requirement along these routes, and to send more than necessary in the query or in the response, carrying around not only unnecessary permissions but also unnecessary limitations tied to related system state.

As an example of how things can go wrong, imagine a city building game where the population has happiness ratings. If each individual citizen has a happiness rating, then they will need to calculate that happiness rating. Let's assume the number of citizens isn't grossly overwhelming, with maybe a maximum of a thousand buildings and up to ten citizens per building. If we only calculate the happiness of the citizens when necessary, it will speed things up, and in at least one game where these numbers are similar, lazy evaluation of the citizen happiness was the way things were done. How the happiness is calculated can be an issue if it is worked out from the perspective of the individual, rather than the perspective of the city. If a citizen is happy when they are close to work, close to local amenities, far from industrial locations, and able to get to recreational areas easily, then a lot of the happiness rating comes from a kind of pathfinding. If the result of pathfinding is cached, then at least the citizens in the same building can benefit, but every building will have small differences in distances to each of the different types of building. Running pathfinding over that many instances is very expensive.

If instead the city calculates happiness, it can build a map of distances from each of the types of building under consideration as a flood fill pass, and create a general distance map of the whole city using the Floyd-Warshall algorithm to help citizens decide how close their places of work are. Normally, substituting an $ \mathcal{O}(n^{3})$ algorithm for an $ \mathcal{O}(n^{2})$ one could be seen as silly, but the pathfinding is being done per citizen, so the instance approach really costs $ \mathcal{O}(n^{2}m)$ over the $m$ citizens, and the single Floyd-Warshall pass is not in fact algorithmically inferior. Finally, this is the real world, and doing the pathing itself has other overheads: running the Floyd-Warshall algorithm to generate a lookup before calculating happiness means the happiness calculation can be simpler (in data storage terms) and require fewer branches off into supporting code. The Floyd-Warshall algorithm can also be partially updated, using the existing map to indicate which items need recalculating. From the instance point of view, reacting to a change in the topology or in the types of nearby buildings would require some form of distance check per instance.
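The city-level pass described above can be sketched as follows. This is a minimal, illustrative implementation of the all-pairs Floyd-Warshall step only (the flood fill and happiness scoring are omitted); the function name and the adjacency-matrix representation are assumptions, not the game's actual code.

```cpp
#include <vector>
#include <algorithm>

// All-pairs shortest distances over the city's building graph.
// Run once per topology change; every citizen then reads distances
// with a plain table lookup instead of pathfinding per instance.
static const int INF = 1 << 29; // large enough that INF + INF fits in int

std::vector<std::vector<int>> AllPairsDistances(
    std::vector<std::vector<int>> dist) // dist[i][j] = edge cost, INF if no edge
{
    const size_t n = dist.size();
    for (size_t k = 0; k < n; ++k)          // allow k as an intermediate stop
        for (size_t i = 0; i < n; ++i)
            for (size_t j = 0; j < n; ++j)
                dist[i][j] = std::min(dist[i][j], dist[i][k] + dist[k][j]);
    return dist;
}
```

The $\mathcal{O}(n^{3})$ cost is paid once for the whole city, after which each citizen's "how far is my workplace?" query is a constant-time lookup.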

In conclusion, abstractions form the basis of solving difficult problems, but in games, we're often not solving difficult algorithmic problems at a gameplay level. On the contrary, we have a tendency to abstract too early, and object-oriented design often gives us an easy and recognisable way to commit to abstractions without making the costs apparent until much later, when we have become too dependent upon them to clear them away without impacting other code.

Hierarchical design vs change

Claim: Inheritance allows reuse of code by extension. Adding new features is simple.

Inheritance was seen as a major reason to use classes in C++ by game programmers. The obvious benefit was being able to inherit from multiple interfaces to gain attributes or agency in system objects such as physics, animation, and rendering. In the early days of C++ adoption, the hierarchies were shallow, not usually going much more than three layers deep, but later it became commonplace to find more than nine levels of ancestors in central classes such as that of the player, their vehicles, or the AI players. For example, in Unreal Tournament, the minigun ammo object had this:

Miniammo $ \rightarrow$ TournamentAmmo $ \rightarrow$ Ammo $ \rightarrow$ Pickup $ \rightarrow$ Inventory $ \rightarrow$ Actor $ \rightarrow$ Object

Game developers use inheritance to provide a robust way to implement polymorphism in games, where many game entities can be updated, rendered, or queried en masse, without any hand-coded checking of type. They also appreciate the reduced copy-pasting, because inheriting from a class adds its functionality to the new class. This early form of mix-ins was seen to reduce coding errors, as bugs often survived only because a programmer had fixed a defect in one copy of some duplicated code, but not in all of them. Gradually, multiple inheritance faded into interfaces only: the practice of inheriting from only one real class, where any others had to be pure virtual interface classes, as per the Java definition.

Although it seems safe to inherit from a class to extend its functionality, there are many circumstances where classes don't quite behave as expected when methods are overridden. To extend a class, it is often necessary to read the source, not just of the class you're inheriting from, but also of the classes it inherits from in turn. If a base class declares a pure virtual method, it forces the child class to implement that method. If this was done for a good reason, then that reason should be enforced, but you cannot enforce that every inheriting class implements the method, only the first instantiable class inheriting it. This can lead to obscure bugs where a new class sometimes acts, or is treated, like the class it inherits from.

Also missing from C++ is the idea of declaring a function non-virtual. You can declare that a function is an override, but you cannot declare that it is not an override. This can cause issues when common words are used and a new virtual method is brought into existence: if it overlaps an extant function with the same signature, then you likely have a bug.
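A minimal sketch of this hazard, with illustrative names (and assuming a C++11 compiler): a base class later gains a virtual function whose signature happens to match a function a derived class already had, and the derived function silently becomes an override.

```cpp
#include <cstring>

// Originally Base had no Tick(); Derived::Tick was a plain member
// function. A later edit adds a virtual Tick() to Base, and
// Derived::Tick silently becomes an override -- with no diagnostic,
// because C++ offers no way to declare "this is NOT an override".
struct Base {
    virtual const char* Tick() { return "Base::Tick"; } // newly added
    virtual ~Base() {}
};

struct Derived : Base {
    const char* Tick() { return "Derived::Tick"; } // pre-existing, now overrides
};

// Every virtual call through Base now changes behaviour.
const char* TickThroughBase(Base& b) { return b.Tick(); }
```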

\begin{linespread}{0.75}\lstinputlisting[language=C,caption={Runtime, compile-time, or link-time?},label=src:compiletime]{src/HARM_compiletime.cpp}\end{linespread}

Another pitfall of inheritance in C++ comes in the form of runtime versus compile time linking. A good example is default arguments on method calls and badly understood overriding rules. What would you expect the output of the program in listing [*] to be?

Would you be surprised to find out it reported a value of 10? Some code relies on the compiled state, and some on the runtime state. Adding new functionality to a class by extending it can quickly become a dangerous game: classes from two layers down can cause coupling side effects, throw exceptions (or worse, quietly fail instead of throwing), circumvent your changes, or simply make it impossible to implement your feature because they already occupy the namespace or have some other incompatibility with your plans, such as requiring a certain alignment or needing to live in a certain bank of RAM.
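A minimal sketch of the kind of pitfall at play here (illustrative names, not the book's listing): default arguments are substituted at compile time from the static type of the call expression, while the function body is selected at runtime from the dynamic type, and mixing the two gives surprising results.

```cpp
// Default arguments are bound at compile time by the static type;
// virtual dispatch picks the body at runtime by the dynamic type.
struct A {
    virtual int GetValue(int v = 10) { return v; }
    virtual ~A() {}
};

struct B : A {
    int GetValue(int v = 20) override { return v; }
};

int CallThroughBase() {
    B b;
    A* a = &b;
    return a->GetValue(); // runs B::GetValue, but with A's default of 10
}
```

Called through `A*`, the override runs with the base class's default argument, so the result is 10, not 20.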

Inheritance does provide a clean way of implementing runtime polymorphism, but it's not the only way, as we saw earlier. Adding a new feature by inheritance requires revisiting the base class, providing a default implementation or a pure virtual, then providing implementations for all the classes that need to handle the new feature. This requires modifying the base class, and possibly touching all of the child classes if the pure virtual route is taken. So even though the compiler can help you find all the places where the code needs to change, it has not made it significantly easier to change the code.

Using a type member instead of a virtual table pointer can give you the same runtime code selection, can be better for cache misses, and can make new features easier to add and reason about: it carries less baggage when implementing those features, provides a very simple way to mix and match capabilities compared to inheritance, and keeps the polymorphic code in one place. For example, in the fake virtual function go-forward, the class Car will step on the gas; in the class Person, it will set the direction vector; in the class UFO, it will also just set the direction vector. This sounds like a job for a switch statement with fall-through. In the fake virtual function re-fuel, the Car and UFO classes will start a re-fuel timer and remain stationary while their fuelling-up animations play, whereas the Person class could just reduce its stamina-potion count and be instantly refuelled. Again, a switch statement with fall-through provides all the runtime polymorphism you need, but you don't need multiple inheritance to provide different functionality on a per-class, per-function level. Being able to pick what each method does in a class is not something inheritance is good at, but it is desirable, and non-inheritance-based polymorphism does allow it.
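The go-forward/re-fuel example above can be sketched with a type member and fall-through; the struct layout and field names here are illustrative assumptions, not a prescribed design.

```cpp
#include <cstdint>

enum class EntityType : uint8_t { Car, Person, UFO };

struct Entity {
    EntityType type;
    float gas = 0.f;          // Car only
    float dirX = 0.f;         // Person / UFO heading
    int   staminaPotions = 3; // Person only
    float refuelTimer = 0.f;  // Car / UFO
};

// "go forward": Person and UFO share a body via fall-through.
void GoForward(Entity& e, float dx) {
    switch (e.type) {
    case EntityType::Car:
        e.gas += 1.f;   // step on the gas
        break;
    case EntityType::Person:
    case EntityType::UFO:
        e.dirX = dx;    // just set the direction vector
        break;
    }
}

// "re-fuel": Car and UFO share the timer path; Person is instant.
void Refuel(Entity& e) {
    switch (e.type) {
    case EntityType::Car:
    case EntityType::UFO:
        e.refuelTimer = 5.f;   // stay put while the animation plays
        break;
    case EntityType::Person:
        e.staminaPotions -= 1; // instantly refuelled
        break;
    }
}
```

Note how each function groups the types its own way: Person pairs with UFO for movement, but Car pairs with UFO for refuelling, a per-function mixing that an inheritance tree cannot express cleanly.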

The original reason for using inheritance was that you would not need to revisit the base class, or change any of the existing code, in order to extend and add functionality to the codebase. However, it is highly likely you will at least need to view the base class implementation, and with changing specifications in games, it's also quite common to need changes at the base class level. Inheritance also inhibits certain types of analysis by locking people into thinking of objects as having IS-A relationships with the other object types in the game. A lot of flexibility is lost when a programmer is locked out of conceptualising objects as combinations of features. Reducing multiple inheritance to interfaces, though it helps reduce code complexity, has drawn a veil over the one good way of building up classes as compound objects. Although not a good solution in itself, as it still abuses the cache, a switch on type seems to offer similar functionality to virtual tables without some of the associated baggage. So why put things in classes?

Divisions of labour

Claim: Modular architecture for reduced coupling and better testing

The object-oriented paradigm is seen as another tool in the kit when it comes to ensuring quality of code. By strictly adhering to the open-closed principle, and always using accessors, methods, and inheritance to use or extend objects, programmers write significantly more modular code than they would from a purely procedural perspective. This modularity separates each object's code into units: collections of all the data and the methods that act upon that data. It has been written many times that testing objects is simpler because each object can be tested in isolation.

However, we know this to be untrue, due to data being linked together by purpose, and purposes being linked together by data, in a long chain of accidental relationships.

Object-oriented design suffers from the problem of errors in communication. Objects are not systems; systems need to be tested, and systems comprise not only objects, but also their inherent communication. The communication between objects is difficult to test because, in practice, it is hard to isolate the interactions between classes. Object-oriented development leads to an object-oriented view of the system, which makes it hard to isolate non-objects such as data transforms, communication, and temporal coupling.

Modular architecture is good because it limits the potential damage caused by changes, but just like encapsulation before, the contract to any module has to be unambiguous so as to reduce the chance of external reliance on unintended side effects of the implementation.

The reason the object-oriented modular approach doesn't work as well is that the modules are defined by object boundaries, not by a higher-level concept. Good examples of modularity include stdio's FILE, the CRT's malloc/free, and the NvTriStrip library's GenerateStrips. Each of these provides a solid, documented, narrow set of functions to access functionality that could otherwise be overwhelming and difficult to reason about.
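The narrowness these examples share can be sketched as an opaque handle plus a handful of documented functions. This is a hypothetical miniature in that style, not any real library's API:

```cpp
// A narrow module boundary in the spirit of stdio's FILE: callers see
// an opaque handle and four functions, never the internals, so they
// cannot come to rely on implementation details.
struct Counter; // opaque to users; only the module sees the definition

struct Counter { int value; }; // "implementation file" side

Counter* CounterOpen()                    { return new Counter{0}; }
void     CounterAdd(Counter* c, int n)    { c->value += n; }
int      CounterTotal(const Counter* c)   { return c->value; }
void     CounterClose(Counter* c)         { delete c; }
```

The whole contract is four signatures; everything behind them can change freely, which is the property FILE and malloc/free exploit.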

Modularity in object-oriented development can offer protection from other programmers who don't understand the code. But why would a programmer who doesn't understand the code be safe using even a trivialised and simplified interface? An object's methods are often the instruction manual for the object in the eyes of someone new to the code, so writing all the important manipulation methods in one block can give clues to anyone using the class. Modularity is important here because game-development objects are regularly large, offering a lot of functionality spread across their many different aspects. Rather than find a way to address cross-cutting concerns, game objects tend to grow to fulfil all requirements rather than restrict themselves to their original design. Because of this bloating, the modular approach, that is, collecting methods by their concern in the source, can be beneficial to programmers coming at the object fresh. The obvious fix would be a paradigm that supports cross-cutting concerns at a more fundamental level, but object-oriented development in C++ seems inefficient at representing this in code.

If object-oriented development doesn't increase modularity in such a way that it provides better results than explicitly modularising code, then what does it offer?

Reusable generic code

Claim: Faster development time through reuse of generic code

It is regarded as one of the holy grails of development to be able to consistently reduce development overhead by reusing old code. In order to stop wasting any of the investment in time and effort, it's been assumed it should be possible to put together an application from existing code and only have to write some minor new features. The unfortunate truth is that any interesting new features you want to add will probably be incompatible with your old code and your old way of laying out your data, and you will need either to rewrite the old code to allow for the new feature, or to rewrite the old code to allow for the new data layout. If a software project can be built from existing solutions, from objects invented to provide features for an old project, then it's probably not very complex. Any project of significant complexity includes hundreds if not thousands of special-case objects that provide all the particular needs of that project. For example, the vast majority of games will have a player class, but almost none share a common core set of attributes. Is there a world-position member in a game of poker? Is there a hit-point count in the player of a racing game? Does the player have a gamer tag in a purely offline game? Having a generic class that can be reused doesn't make the game easier to create; all it does is move the specialisation somewhere else. Some game toolkits do this by allowing script to extend the basic classes. Some game engines limit the gameplay to a certain genre and allow extension away from that through data-driven means. No one has so far created a game API, because to do so it would have to be so generic that it wouldn't provide anything more than what we already have with the languages we use for development.

Reuse, hankered after by production and thought of highly by anyone without much experience in making games, has become an end in itself for many game developers. The pitfall of generics is a focus on keeping a class generic enough to be reused or re-purposed without thought as to why, or how. The first, the why, is a major stumbling block and needs to be taught out of developers as quickly as possible. Making something generic for the sake of generality is not a valid goal; making something generic in the first instance adds time to development without adding value. Some developers would cite this as short-sighted, but it is the how that deflates this argument. How do you generalise a class if you only use it in one place? If you only use a class in one place, you can only test that it works in one situation. The quality of a class's reusability is inherently untestable until there is something to reuse it, and the general rule of thumb is that a class is not reusable until at least three things are using it. If you then generalise the class, yet have no test cases other than the first situation, all you can test is that you didn't break the class while generalising it. So, if you cannot guarantee that the class works for other types or situations, all you have done by generalising the class is add more code for bugs to hide in. The resultant bugs are now hidden in code that works, possibly even tested in its isolation, which means any bugs introduced during this generalising have been stamped and approved, and are now trusted.

Test-driven development implicitly denies generic coding until the point where it is a good choice to do so. The only time when it is a good choice to move code to a more generic state, is when it reduces redundancy through reuse of common functionality.

Generic code has to fulfil more than just a basic set of features if it is to be used in many situations. If you write a templated array container, access through the square-bracket operators would be considered a basic feature, but you will also want to write iterators for it, and possibly add an insert routine to take the headache out of shuffling the array up in memory. Little bugs can creep in if you rewrite these functions whenever you need them; linked lists are notorious for having bugs in quick-and-dirty implementations. To be fit for use by all users, any generic container should provide a full set of methods for manipulation, and the STL does that. There are hundreds of different functions to understand before you can be considered an STL expert, and you have to be an STL expert before you can be sure you're writing efficient code with the STL. There is a large amount of documentation available for the various implementations of the STL, and most of the implementations are very similar if not functionally the same. Even so, it can take some time for a programmer to become a valuable STL programmer, because of this need to learn another language: the language of the STL, with its own nouns, verbs, and adjectives. To limit this, many games companies have a reinterpretation of the STL with a much reduced feature set, optionally providing better memory handling (because of the awkward hardware), more choice of containers (so you may choose a hash-map, trie, or b-tree directly, rather than just a map), or explicit implementations of simpler containers such as stacks or singly linked lists and their intrusive brethren. These libraries are normally smaller in scope and are therefore easier to learn and hack than the STL variants, but they still need to be learnt, and that takes some time.
In the past this was a good compromise, but now that the STL has extensive online documentation, there is no excuse not to use it except where memory overhead is very intrusive, such as in the embedded space where main memory is measured in kilobytes, or where compilation time is of massive concern.
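To see how quickly even a "simple" generic container grows the interface a user must learn, consider a minimal sketch of the templated array described above; this is illustrative, not a production container.

```cpp
#include <cstddef>

// Even a "simple" generic array immediately needs operator[], size,
// begin/end for iteration, and an insert that shuffles elements up in
// memory -- the classic bug-prone part when rewritten ad hoc.
template <typename T, size_t N>
struct FixedArray {
    T data[N];
    size_t count = 0;

    T&       operator[](size_t i)       { return data[i]; }
    const T& operator[](size_t i) const { return data[i]; }
    size_t   size() const { return count; }
    T* begin() { return data; }
    T* end()   { return data + count; }

    void push_back(const T& v) { data[count++] = v; }

    // Shuffle elements up to make room at 'at', then write the value.
    void insert(size_t at, const T& v) {
        for (size_t i = count; i > at; --i) data[i] = data[i - 1];
        data[at] = v;
        ++count;
    }
};
```

Each of these members is another thing a newcomer must understand and trust before the class earns its keep, which is the real cost of writing generics before there are three users to justify them.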

The takeaway, however, is that generic code still needs to be learnt in order for the coder to be efficient, and to avoid causing accidental performance bottlenecks. If you go with the STL, then at least you have a lot of documentation on your side. If your game company implements an amazingly complex template library, don't expect any coders to use it until they've had enough time to learn it. That means if you write generic code, expect people not to use it unless they come across it accidentally or have been explicitly told to, as they won't know it's there, or won't trust it. In other words, starting out by writing generic code is a good way to write a lot of code quickly without adding any value to your development.


Richard Fabian 2018-10-08