I was recently shown a video by Scott Meyers on CPU caches, and the first ten minutes alone seem to be a reasonable push for understanding that DOD affects all languages, not just C++ or C.
The video is at https://vimeo.com/97337258 for your viewing pleasure.
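To make the cache point concrete, here is a minimal sketch of a common illustration: summing the same matrix in two traversal orders. The function names and the flat row-major layout are my own assumptions for the example, not something taken from the video. Both functions return the same total; the only difference is whether memory is touched sequentially or with a stride, which is the sort of thing that tends to show up in timings on large matrices.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helpers: the matrix is assumed to be stored row-major
// in one flat vector, so element (r, c) lives at index r * cols + c.

// Walks memory in address order: each access is next to the previous one.
long long sum_row_major(const std::vector<int>& m,
                        std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same arithmetic, but each access jumps ahead by `cols` elements,
// so consecutive reads land far apart in memory.
long long sum_column_major(const std::vector<int>& m,
                           std::size_t rows, std::size_t cols) {
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

If you want to see the effect, time both on a matrix much larger than your last-level cache; the actual ratio depends on the machine, so measure rather than guess.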
Manipulating data in structure of arrays format can be unwieldy for some, but this post talks about making things easier using some simple templating to replace the manual side of iterating through the arrays.
Read C++ encapsulation for Data-Oriented Design: performance and learn about keeping your DOD SoA approach tidier.
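As a rough idea of what "simple templating" over SoA data can look like, here is a minimal sketch. It is not the code from the linked post; the `SoA` class, `column`, and `for_each` are hypothetical names of my own, and it assumes a C++17 compiler (for `std::apply` and fold expressions). Each field gets its own contiguous `std::vector`, and `for_each` hides the index bookkeeping.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical structure-of-arrays container: one std::vector per field.
template <typename... Fields>
class SoA {
public:
    // Append one logical element by pushing each value onto its own column.
    void push_back(Fields... values) {
        push_back_impl(std::index_sequence_for<Fields...>{},
                       std::move(values)...);
    }

    // All columns have the same length, so the first one is representative.
    std::size_t size() const { return std::get<0>(columns_).size(); }

    // Direct access to a single column, for loops that need only one field.
    template <std::size_t I>
    auto& column() { return std::get<I>(columns_); }

    // Call fn(field0, field1, ...) for every element, by reference.
    template <typename Fn>
    void for_each(Fn&& fn) {
        for (std::size_t i = 0; i < size(); ++i)
            std::apply([&](auto&... col) { fn(col[i]...); }, columns_);
    }

private:
    template <std::size_t... Is>
    void push_back_impl(std::index_sequence<Is...>, Fields... values) {
        (std::get<Is>(columns_).push_back(std::move(values)), ...);
    }

    std::tuple<std::vector<Fields>...> columns_;
};
```

With something like `SoA<float, float, int> particles;`, a call such as `particles.for_each([](float& x, float& y, int& id) { x += y; });` reads like a loop over objects while the storage stays column-wise. Loops that touch only one field can go straight to `particles.column<0>()` and skip the rest of the data entirely.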
It has become more obvious to people involved in optimisation that the x86 architecture is a difficult platform to understand at the core. This is partially because of the multitude of different CPUs out there that support the instruction set, each with their own timings, but also because of this latest breed of extraordinarily out-of-order CPUs. Knowing what's actually going to happen in an i7 has become a near-impossible task.
Read Robert Graham's post on x86 is a high-level language and try to see why it's so very difficult to grok the flow of data in these chips, and also how it's very difficult to guess what will be the best performing algorithm without doing a lot of real world tests.
Nice read on why grep is quick. Some simple stuff, some awesome algorithm usage, and generally the kind of thing you might want to keep in your head in case you come across a searching problem that is similar to grep in any way.