“There Were No Bugs!”
A few weeks ago I was talking to an engineer who had led the team that designed and built a recently released OKL4-based mobile phone. Among the technical details, one comment stuck in my mind: “There were no bugs,” he said, with an expression that added “… and I still find that hard to believe!”
This may sound incredible, but it’s true. Through the several years they worked with OKL4, they never triggered a bug in our code. Not a single one!
There are two observations that can be drawn from this.
Firstly, this is clearly a compliment to our engineering team: it reflects well on our engineers, but also on the maturity of our software process. If we release something, it works, despite the rapid development the system has gone through, with significant changes to the API. You won’t find many companies that can do this. You need true world-class engineering.
The second is a reflection on the fundamental approach taken with OKL4: a small (well-designed) code base that minimises the part that executes in privileged mode. This is not only good for the robustness and security of the deployed code (the usual argument for a small trusted computing base). It’s also a massive help for our own software process: a bug in the privileged code can manifest itself anywhere, inside or outside the kernel. That’s part of the reason why kernel code is so much harder to debug than user-mode code. Keeping it small makes debugging easier, and thus increases both engineering productivity and overall product quality.
Of course, assembler code is even harder to debug than the same number of lines of C code, which is why we are constantly reducing the amount of assembler code in the kernel, without sacrificing performance. Compare that to products which proudly state that their kernel is written completely in assembler. Imagine how expensive and error-prone it is for them to change anything in the system! They’ll find it hard to react to customer requirements.