Security Fiasco: Why small is necessary but not sufficient for security
“Small is beautiful” is nowhere more true than it is in security. The smaller a system’s trusted computing base (TCB), the more feasible it is to achieve a reasonable degree of security. The TCB is the part of the system that must be trusted to perform correctly in order to have any security in the system. It includes at least parts of the hardware (processor, memory, some devices) and the most privileged part of the software, i.e. the operating system (OS) kernel, and, for virtualised systems, the hypervisor.
This is one of the strongest justifications for microkernels: you want to absolutely minimise the OS part of the TCB, by using a kernel (the part of the OS running in privileged mode) that is kept as small as possible. Kernel size is minimised by only including the minimum functionality, i.e. fundamental mechanisms, and leaving implementation of other functionality, including all policy, to (unprivileged) user-mode software.
We have built the seL4 microkernel on this insight, and taken it to its logical conclusion, by mathematical proving the correctness (“bug freedom”) of the implementation, as well as general security properties. These proofs are critically enabled by seL4’s small size (about 10k SLOC).
But while (barring dramatic technological advances which are many years away) security requires small size, this doesn’t mean that a kernel is secure just because it’s a microkernel!
A case in point here is the Fiasco.OC microkernel (also called the L4Re Microkernel – Fiasco.OC is the kernel and L4Re is the userland programming environment built on top). Some propagate Fiasco.OC as the core of a national security competency and advocate it as the base for the development of a national security infrastructure. Sounds like a good idea, after all, it’s a microkernel?
Turns out, it’s not. Turns out, Fiasco.OC (besides being not all that “micro”, weighing in at about 20–35k SLOC depending on configuration) isn’t secure after all, and most likely can’t be made secure without a comprehensive re-design.
Details are provided in a paper by researchers from the TU Berlin (and interestingly the lead author is a former member of the Fiasco.OC team). It shows that Fiasco.OC’s memory management functionality provides covert storage channels of significant bandwidth.
The mechanisms underlying those channels are the result of design flaws of Fiasco.OC’s memory management, which, on the surface, is driven by security needs. Specifically, Fiasco.OC provides an abstraction of per-partition memory pools. However, these are part of a global management structure that breaks isolation. In short: Fiasco.OC’s “partitioned” memory pools ain’t.
The deeper cause for this breakdown of security is that Fiasco.OC violates a core microkernel principle: that of policy-mechanism separation. The microkernel is supposed to provide policy-free mechanisms, with which policies are implemented at user level. In contrast, Fiasco.OC’s memory management encapsulates a fair amount of policy, and it is exactly those policies which the TUB researchers exploited.
This is a nice example that demonstrates that you’ll pay eventually when deviating from the core principle. Usually the cost comes in the form of restricting the generality of the system. Here it is in the form of a security breakdown. (Ok, for purists, that’s also a form of restricting generality: Fiasco.OC is restricted to use in cases where security doesn’t matter much. Not very assuring for those propagating it as a national security kernel!)
Given the size of Fiasco.OC (2–3 times that of seL4) and the lack of understanding of security issues that seems to have affected the design of its memory management, one must suspect that there are more skeletons hiding in the security closet.
This all is in stark contrast to seL4, which has been designed for isolation from the ground up, including radically eliminating all policy from memory management. This approach has enabled a mathematical proof of isolation in seL4: In a so-called non-interference proof that applies to the executable binary (not just a model of the kernel), folks in my group have shown that a properly-configured seL4 system enforces isolation. The proof completely rules out attacks of the sort the TUB researchers have levelled against Fiasco.OC!
To be fair, seL4’s security proofs only rule out storage channels, while leaving the potential for timing channels. This is, to a degree, unavoidable: while it is, in principle, possible to prevent storage channels completely (as has been done with seL4), it is considered impossible to completely prevent timing channels. At least seL4 has a (still incomplete) story there (see our recent paper on assessing and mitigating timing channels), with more work in progress, while Fiasco.OC has known unmitigated timing channels (in addition to the storage channels discussed above).
Building security solutions on secure microkernel technology is good, and I am advocating it constantly. But it will only lead to a false sense of security if the underlying microkernel is inherently insecure. seL4, in contrast, is the only OS kernel with provable security. Use it as the base and you have at least a chance to succeed!