VMware MVP: What it really is
Recently I had a look at what has become of VMware’s MVP and explained the security shortcomings of the Type-2 hypervisor design. Today I’m looking at VMware’s approach in more detail, and explaining why it is in fact not a real Type-2 hypervisor, and what this implies.
Type-2 hypervisors are known for poor performance. I explained the reasons in detail a while back, so I’ll just summarise them here (refer to the earlier blog for more details).
A system call performed by an application is a privileged operation which is intercepted by the hypervisor, which (after deciding that this is an operation which should be handled by the guest) forwards it to the guest OS. The return to user mode from the guest takes a similar detour through the hypervisor, as indicated in the left part of the diagram.
In the case of a Type-1 hypervisor, this results in a total of four mode switches and two context switches. However, in the case of a Type-2 hypervisor, the system call is trapped by the host OS, which delivers it to the hypervisor, and a return from the hypervisor to either the guest or the app similarly takes a detour via the host. All up, the number of mode switches and context switches is doubled, as indicated in the right part of the diagram. Further cost arises from the fact that while a Type-1 hypervisor (such as OKL4) is highly optimised for this trampolining, the host OS generally isn’t. In reality, the overhead of doing a simple system call is in the Type-2 case not just double that of the Type-1, but closer to an order of magnitude higher. This is why virtualization with a Type-2 hypervisor is generally slow. Note that ARM’s forthcoming architecture extensions to support virtualization (I’ll discuss them in a future blog) help to reduce the overheads of a Type-1 hypervisor, but do little to help a Type-2.
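The switch counts above can be tallied with a short sketch. It is a toy model, not a measurement: the component names, the `count_switches` helper and the rule for what counts as a context switch are assumptions made for illustration. Only the resulting totals come from the text.

```python
# Toy model of one system call plus its return under each hypervisor type.
# Each hop between adjacent components on the path is one mode switch.
# A context switch is counted whenever control moves between components
# that have their own address space. The privileged layer that does the
# trampolining (hypervisor for Type-1, host OS for Type-2) is assumed to
# be mapped everywhere, so it is skipped when counting context switches.

def count_switches(path, resident):
    """Return (mode_switches, context_switches) for a control-flow path."""
    mode_switches = len(path) - 1
    spaces = [c for c in path if c not in resident]
    context_switches = sum(1 for a, b in zip(spaces, spaces[1:]) if a != b)
    return mode_switches, context_switches

# Type-1: the system call and the return each detour via the hypervisor.
TYPE1_PATH = ["app", "hypervisor", "guest", "hypervisor", "app"]

# Type-2: every hop to or from the hypervisor additionally goes via the host.
TYPE2_PATH = ["app", "host", "hypervisor", "host", "guest",
              "host", "hypervisor", "host", "app"]

type1 = count_switches(TYPE1_PATH, resident={"hypervisor"})
type2 = count_switches(TYPE2_PATH, resident={"host"})

print("Type-1 (mode, context):", type1)  # (4, 2)
print("Type-2 (mode, context):", type2)  # (8, 4)
```

The model only counts hops. It says nothing about the cost of each hop, which is where the un-optimised host path turns a factor of two into something closer to an order of magnitude.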
VMware understands this, and has taken a different approach in MVP, which I’ll explain now.
Fundamentally, the high cost of Type-2 virtualization stems from the fact that the hypervisor effectively consists of two parts, the host OS and the hypervisor proper, that each (logical) hypervisor invocation bounces twice between those layers, and that the host mechanisms used for this bouncing are inefficient. So, what VMware does in MVP is to merge the hypervisor back in with the host.
This is done by loading an MVP module (called “MVPkm”) into the host OS kernel, as shown in the diagram to the right. (They discuss this for Android; it is not clear whether they plan to support other hosts, such as Windows or Symbian. If they do, they’ll have to redo the kernel module for each host.) The MVP module effectively hijacks the host by re-writing the exception vectors, so it obtains control whenever the guest kernel is entered. (Note: this is exactly what a piece of malware would do.) The process turns the host kernel into a hypervisor.
The result is not really a Type-2 hypervisor any more, as it actually runs native, not on top of a host OS (but inside it), and has direct control over physical resources (rather than the virtualized resources provided to it by the host). However, it isn’t a Type-1 hypervisor either, as it does not have exclusive control over the hardware: this is shared with the rest of the host, and any code inside the host kernel can interfere with the operation of the hypervisor module.
So, if this hypervisor is neither a Type-2 nor a Type-1, what is it? I call it a hybrid hypervisor, as it is somewhat of a blend of the two basic types. A better-known representative of the hybrid hypervisor type is the widely-used KVM (often falsely referred to as a Type-2 hypervisor). It operates very similarly, although KVM is dependent on virtualization extensions to the architecture (MVP is not, but can make use of them).
The hybrid hypervisor can achieve performance similar to that of a Type-1 hypervisor, so this scheme seems pretty neat at first glance. The problem is that this performance is bought at a heavy price.
The one advantage a Type-2 hypervisor has over a Type-1 is that it can be easily installed: for the host OS it’s just another app, and it is installed just like an app, without requiring any special privileges.
This advantage is lost with the hybrid approach. It requires inserting a kernel module into the host OS, which is a highly security-critical operation (after all, it is the same as installing a rootkit into the kernel!). As such, it requires special privileges. On a mobile phone it requires cooperation with the device vendor or network operator, as they try very hard to prevent the unauthorised insertion of malware-like code into the OS!
While losing the ease-of-install advantage of the Type-2 to buy Type-1-like performance, the hybrid hypervisor inherits all the other drawbacks of the Type-2 hypervisor, especially the huge size of the trusted computing base. Everything in the host OS (all of a million or so lines of code!) needs to be trusted, a huge attack surface. So, while MVP is a hybrid hypervisor rather than a real Type-2, everything about the drawbacks of VMware’s approach I discussed in the earlier blog and its successor remains valid!
In summary, the hybrid approach taken with MVP has no discernible advantage over a lightweight, high-performance Type-1 hypervisor such as OKL4. MVP still requires manufacturer/MNO cooperation to install (unlike a real Type-2). It can, in theory, reach the performance of OKL4, although I’ll believe that when I see it, given that OKL4’s performance is so much better than anything else I’ve seen. But the fundamental weakness of the hybrid approach, which it shares with proper Type-2 hypervisors, is that it adds nothing to the security of the guest apps: they are every bit as exposed as if they were running directly on the host. Which begs the question: Why bother?
Speaking of attacks, if you think carefully about it, you realise that MVP might very well increase the exposure of handsets to malware. Put yourself in the shoes of a blackhat and think about how to get a rootkit onto a handset. If you know that a handset is provisioned to have MVP loaded on it, you know that it has provision for loading the MVP kernel module. It might well be that the easiest way to crack the system is to write a rootkit module which masquerades as MVPkm. I’ll sure stay away from such phones!
In a future blog I will investigate how each type of hypervisor does (or doesn’t) support the various use cases for mobile virtualization. Stay tuned, and drop me a line if you have questions.
VMware’s MVP—Encryption Doesn’t Make It Secure!
Last week I talked about the backwards step VMware is taking by implementing their long-overdue mobile virtualization platform (MVP) as a Type-2 hypervisor. In the meantime, an insightful blog (which liberally quotes from my blog, although without attribution) talks about their use of encryption to try to protect user (actually, enterprise) data. I’ll explain here why this is just window-dressing, providing an appearance of security rather than the real thing.
VMware say they encrypt the guest’s data on flash and also use an encrypted VPN tunnel to connect to the enterprise network. Surely, this will protect the data from attacks?
Surely not. This is akin to thinking that the data on your Windows laptop is safe from rootkits because the disk is encrypted. It ain’t. Where encrypting the disk helps is if you lose your laptop and someone finds/steals it and breaks into it. If your OS gets infected by malware, it helps zilch. ‘Cause in order to be processed, the data is loaded into memory and decrypted. And there it is fully accessible by the OS, and if that OS is infected, there’s no way to stop the malware from seeing (and leaking) your data.
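The distinction between data at rest and data in use can be made concrete with a toy sketch. The XOR “cipher” here is a stand-in chosen only to keep the example self-contained: it is not real cryptography, and nothing in it reflects what VMware actually uses. The names `xor_cipher`, `flash` and `memory` are hypothetical.

```python
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a repeating key (NOT secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = secrets.token_bytes(32)
secret = b"enterprise sales record"

# At rest: only the ciphertext is stored on disk or flash.
flash = xor_cipher(secret, key)

# In use: to be processed, the data has to be decrypted into memory.
memory = xor_cipher(flash, key)

# Someone who steals the powered-off device sees only the ciphertext.
thief_view = flash
# Malware running inside the OS can read memory, so it sees the plaintext.
malware_view = memory

print("thief sees plaintext:  ", thief_view == secret)
print("malware sees plaintext:", malware_view == secret)
```

However strong the real cipher is, the second comparison comes out the same way: whatever code can read the decrypted copy in memory gets the data.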
Same story on the phone with the Type-2 hypervisor. The hypervisor can encrypt the guest’s data until the cows come home, but that doesn’t protect it from malware infecting the hypervisor or the host OS underneath. If the host gets cracked, the hypervisor gets cracked. If the hypervisor gets cracked, you lose. No way around this fundamental truth. And the inconvenient bit of the truth is that the host+Type-2 presents a huge attack surface, while for a well-designed Type-1 hypervisor, such as the OKL4 Microvisor, that attack surface is tiny, about two orders of magnitude smaller. Take your pick!
So, what is an MVP-style solution good for? I’ll look at this later, but first need to take a more in-depth (and rather technical) look at VMware’s approach. Stay tuned!
Much Ado About Type-2
VMware has finally lifted the lid on their long-promised mobile virtualization platform (MVP). And, surprise, it’s a Type-2 hypervisor! This is a bit of a let-down, and has some interesting implications for what MVP can (or rather cannot) do, which I’m going to explore in a few blogs.
First a bit of background. Observers of the mobile virtualization space will remember that about two years ago, VMware, better known for server and desktop virtualization products, bought our then competitor Trango. At the time they promised MVP-based products “should arrive in around 12 to 18 months”. That’s phones with MVP on them. Almost 24 months later, there isn’t even a product announcement for MVP. It’s been a bit like waiting for Godot…
In the meantime, the OKL4 Microvisor has been around for yonks. It’s available, it’s benchmarkable, it’s being deployed—it’s real. And, as befits something with “L4” in the name, it defines the state of the art of hypervisors for embedded systems.
Well, at last (least?) VMware presented their vision, accompanied by a demo, at a BOF at last week’s OSDI conference in Vancouver. Not exactly a high-profile announcement. And it’s a Type-2 hypervisor!
I’ve discussed Type-1 vs Type-2 in a blog a year ago, and another one a few months earlier, and will probably explore this topic a bit more in a future blog. For now I’ll focus on what VMware is trying to sell, and why it actually doesn’t solve the problem they claim they are addressing. Further technical discussion will look at why they are taking this particular stance. (Hint: If all you’ve got is a hammer, everything looks like a nail. Even an egg…)
Hypervisors (also called virtual machine monitors) are designed to provide multiple virtual machines which can each run an OS with all of its apps. The fundamental difference between a Type-1 hypervisor (such as OKL4) and a Type-2 is that the former runs on bare metal, between the hardware and the operating system(s). In contrast, a Type-2 hypervisor runs on top of an OS (which is why it’s also called a “hosted” hypervisor).
That difference is much more significant than it may seem. It implies a completely different relationship between the hypervisor and the various operating systems. With Type-1, the hypervisor is master, it controls the OSes (called “guests”). With Type-2, the master is an OS (the one which hosts the hypervisor), it controls the hypervisor, which can only control the other OSes. Keep this in mind.
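One way to keep it in mind is to write the two arrangements down as a “who controls whom” graph and ask what each component ends up controlling, directly or indirectly. This is an illustrative model only: the component names and the `controlled_by` helper are assumptions, not part of any hypervisor’s API.

```python
# controls[x] lists the components that x is the direct master of.
TYPE1 = {"hypervisor": ["guest OS A", "guest OS B"]}
TYPE2 = {"host OS": ["hypervisor"], "hypervisor": ["guest OS"]}

def controlled_by(controls, master):
    """Return everything `master` controls, directly or indirectly."""
    seen, todo = set(), list(controls.get(master, []))
    while todo:
        node = todo.pop()
        if node not in seen:
            seen.add(node)
            todo.extend(controls.get(node, []))
    return seen

# Type-1: the hypervisor is master of all guests; a guest is master of nothing.
print(controlled_by(TYPE1, "hypervisor"))
print(controlled_by(TYPE1, "guest OS A"))

# Type-2: the host OS is master of the hypervisor and, through it, of the guest.
print(controlled_by(TYPE2, "host OS"))
```

In the Type-1 graph no OS controls another OS. In the Type-2 graph one OS sits above everything else.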
So, what problems is VMware (pretending to be) solving with their Type-2 hypervisor? The main use case they are highlighting is BYOD, “bring your own device”. (Yes, they adopted the terminology we introduced 18 months ago. Good on them!)
The motivation for BYOD is that smartphones have business as well as private use. People like to control their private phones: They want to decide on the type and model, and they want to install their choice of apps. In contrast, companies like control over the phones used for business: They want to decide the model (ideally a single one for everybody) and what software runs on them. This forces an increasing number of people to carry two phones, business and private.
The idea of BYOD is that a single phone can serve both purposes: a person buys a phone of their choice, takes it to their company’s IT dudes, and they install a virtual business phone on the BYOD handset. Sounds great, doesn’t it?
The devil is in the detail, and it’s those details which make MVP a non-solution.
Why do companies want control over the phone? There’s only one reason: security. The whole point of issuing smartphones to employees is to keep them linked into the enterprise IT infrastructure while they are on the move. Traditionally this is all about email, address books and calendars, but increasingly it is a much deeper integration, enabling the phone to access employee records, sales databases, engineering designs—anything you’d access from your computer screen in the office.
So, the bottom line is that companies are worried about the security and integrity of their data when accessed via the mobile device (phone, tablet or whatever it might be). They are worried that accessing this critical data from an uncontrolled phone puts the critical enterprise information at risk. And they are right: phones do get infected by malware, and with each application installed, the risk of infection increases. This is the core challenge BYOD must address.
Surely, VMware understands this? Maybe they do, but if so, why do they offer a solution which doesn’t cut the mustard?
The reason I say this is that the BYOD model VMware is propagating does nothing to solve this fundamental security issue, while OKL4 does.
This is illustrated in the figures at the left. With OKL4, the (Type-1) hypervisor is in control of all hardware. It isolates the VMs and their OSes from each other. If the user gets their private OS infected, that’s tough for them, but the infection cannot spread across VMs to the business environment. In order to subvert this, the attacker either has to have already subverted some of the enterprise IT infrastructure (thus coming in from the business side into the business OS) or has to attack the hypervisor from the private VM. But the hypervisor has an extremely small attack surface! The hypervisor is very small (about 10,000 lines of code). Technically speaking, the business VM has a small trusted computing base (TCB).
In VMware’s Type-2 model, it’s quite different. The business environment is controlled by the hypervisor, which is controlled by the host OS (the one that comes with the BYOD phone). If this gets cracked, as it inevitably will, then it’s trivial to crack the hypervisor, and then you control the business OS! The reason this is easy to crack is that in this setup, the business OS has a huge TCB. It includes the complete private OS, which likely comprises upwards of 1,000,000 lines of code, two orders of magnitude more than OKL4!
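The size comparison is simple arithmetic, sketched here with the two line counts quoted in the text. The `tcb_size` helper is hypothetical, and the size of the Type-2 hypervisor itself is not given anywhere in the text, so it is left out. That makes the Type-2 figure a lower bound.

```python
import math

OKL4_HYPERVISOR_LOC = 10_000   # "about 10,000 lines of code"
HOST_OS_LOC = 1_000_000        # "upwards of 1,000,000 lines of code"

def tcb_size(components):
    """Total lines of code the business VM has to trust."""
    return sum(components.values())

# Type-1: the business VM trusts only the hypervisor.
tcb_type1 = tcb_size({"hypervisor": OKL4_HYPERVISOR_LOC})

# Type-2: the business VM trusts the whole host OS (plus the hypervisor,
# whose size is unknown here and therefore omitted).
tcb_type2 = tcb_size({"host OS": HOST_OS_LOC})

ratio = tcb_type2 / tcb_type1
orders = math.log10(ratio)
print(f"TCB ratio: {ratio:.0f}x, about {orders:.0f} orders of magnitude")
```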
Now remember where we’re coming from. The original motivation for BYOD was that companies don’t trust people’s private phones with critical business data, because these phones get cracked, which would compromise the business data. The idea of BYOD, as promoted by OK Labs, is to provide a virtual business phone on the private handset which is just as secure as if it was a physically separate handset.
If you followed my argumentation above, you’ll see that VMware’s solution is not one bit more secure than allowing people to access the business data through their normal private phones, without the detour via a hypervisor. In other words, MVP adds nothing to security. So why would you pay for it then? You might as well cut out the middle man and allow people to access the enterprise IT system from their unmodified private phones. Security-wise, there is no difference whatsoever.
At OK Labs, we believe that security isn’t something that’s solved with PR. It requires a technically-sound approach. It requires a minimal TCB. It requires OKL4.
Stay tuned for a more in-depth look at these issues.



