sVirt Simulation Demo

In my last post on this topic I gave a quick description of a simulation of the sVirt architecture. Talking about it is only half the work. In this post I’ll show it in action and interpret the output as it relates to the separation goals.

Building and Installing

After you clone the git repo from my last post you’ll have all of the source code and policy for the simulation. Typing make in the root of the tree should build both the policy and the simulation code. Install it with make install.

There are differences between different Linux distros and how they do their SELinux stuff. This was developed and tested on Debian Squeeze.

Interface

The simulation is a bit sparse so the interface is just a textual one. Run it on the console by running it as root:

svirt-sim

The simulation is run as root because we’re simulating a virtualization management stack like libvirt or xend and makes and manages mock VHDs in /var/lib/svirt-sim/images.

The initial output you’ll see just a prompt of two hashes ## and a message indicating you can get a list of commands by pressing ?. With these commands you can interactively perform actions typical for a virtualization stack: create, delete, start, stop and show information about virtual machines:

press ? for a list of commands
##:?
c : create a VM
d : delete a VM
h : start a VM
k : stop a VM
s : show objects and labels
q : quit

Creating VMs

First things first, we need to create two VMs. When we create a VM the svirt-sim process creates a VHD for it and allocates an MCS category for the mock QEMU instance (that we call not-qemu) that will be started when the VM is started. The first one we’ll create will do what we expect it to do: access it’s VHD file simulating disk access. We’ll call this VM ‘good-one’ since it’s well behaved. The second VM we’ll create will be the ‘bad-one’ and it will attempt to violate confinement and access VHDs belonging to the other VM. Here’s the output we’ll see:

##:c
what's the name of the VM?: good-one
what's the vm's disposition [good|bad]?: good
creating: /var/lib/svirt-sim/images/good-one.vhd
##:c
what's the name of the VM?: bad-one
what's the vm's disposition [good|bad]?: bad
creating: /var/lib/svirt-sim/images/bad-one.vhd

Now that we’ve created our VMs we can examine the internal state of the management stack by pressing ‘s’:

##:s
======================
VMs:
VM Name:        bad-one
  VHD Name:     /var/lib/svirt-sim/images/bad-one.vhd
  Disposition:  bad
  VM qemu PID   0
  VM Status:    Stopped
  mcs category:     s0:c850
  prev vhd context: (null)
VM Name:        good-one
  VHD Name:     /var/lib/svirt-sim/images/good-one.vhd
  Disposition:  good
  VM qemu PID   0
  VM Status:    Stopped
  mcs category:     s0:c440
  prev vhd context: (null)
======================
=============================
dumping security context list
  mcs: s0:c850
  mcs: s0:c440
=============================

We can also see the mock VHDs created by the management stack and examine their security context:

-rw-r-----. 1 root root unconfined_u:object_r:svirt_sim_image_t:s0:c0 0 Sep  3 23:27 bad-one.vhd
-rw-r-----. 1 root root unconfined_u:object_r:svirt_sim_image_t:s0:c0 0 Sep  3 23:27 good-one.vhd

We use the first category (c0) as the context for VMs that aren’t running. No VM should ever be run with this category so the VHD of a running VM should never be accessible to a not-qemu instance.

Running VMs

The VMs created above can be run by pressing ‘h’. Once they’re running press ‘s’ again to see the change in the internal state of the management stack. We’ll now see pid of the mock QEMU process started.

##:h
what's the name of the VM?: good-one
starting vm: good-one
security_start for vm with mcs: s0:c440
started child with pid 1298
##:h
what's the name of the VM?: bad-one
starting vm: bad-one
security_start for vm with mcs: s0:c850
started child with pid 1300
##:s
======================
VMs:
VM Name:        bad-one
  VHD Name:     /var/lib/svirt-sim/images/bad-one.vhd
  Disposition:  bad
  VM qemu PID   1300
  VM Status:    Running
  mcs category:     s0:c850
  prev vhd context: unconfined_u:object_r:svirt_sim_image_t:s0:c0
VM Name:        good-one
  VHD Name:     /var/lib/svirt-sim/images/good-one.vhd
  Disposition:  good
  VM qemu PID   1298
  VM Status:    Running
  mcs category:     s0:c440
  prev vhd context: unconfined_u:object_r:svirt_sim_image_t:s0:c0
======================
=============================
dumping security context list
  mcs: s0:c850
  mcs: s0:c440
=============================

Now that the simulated “VMs” are running we can view their contexts and those of their VHDs to be sure they match up to the internal state of the svirt-sim stack:

# ls -l /var/lib/svirt-sim/images
-rw-r-----. 1 root root system_u:object_r:mcs_server_image_t:s0:c850 0 Sep  3 23:27 bad-one.vhd
-rw-r-----. 1 root root system_u:object_r:mcs_server_image_t:s0:c440 1 Sep  3 23:28 good-one.vhd
ps auxZ | grep not-qemu
system_u:system_r:not_qemu_t:s0:c440 root 1298  0.0  0.1   1664   544 ?        Ss   23:28   0:00 /sbin/not-qemu --good --file=/var/lib/svirt-sim/images/good-one.vhd
system_u:system_r:not_qemu_t:s0:c850 root 1300  0.0  0.1   1696   544 ?        Ss   23:28   0:00 /sbin/not-qemu --bad --file=/var/lib/svirt-sim/images/bad-one.vhd

So far the output looks as we’d expect: the category assigned to the running not-qemu process matches the category of it’s VHD file.

Testing Confinement

Built in to this simulation is a simple confinement test. The not-qemu process has an option I call it’s disposition. If started with the option --good it will only attempt to access the VHD file it’s been assigned. If started with the --bad option then it will cycle through the directory containing all of the VHDs and it will attempt to access all of them. This is meant to simulate a malicious QEMU process reading the disk of another VM. Above we’ve started two VMs, one good and one bad. We can see the evidence of their behavior in the system logs.

The syslog on my system shows the following trace of accesses made by each of the not-qemu processes. I’ve removed the time stamps but the messages should be considered sequential.

/sbin/not-qemu[1298]: opening VHD file: /var/lib/svirt-sim/images/good-one.vhd
/sbin/not-qemu[1298]: not-qemu with pid: 1298 still alive
/sbin/not-qemu[1300]: new file path: /var/lib/svirt-sim/images/.
/sbin/not-qemu[1300]: opening VHD file: /var/lib/svirt-sim/images/.
/sbin/not-qemu[1300]: open failed: Is a directory
/sbin/not-qemu[1298]: opening VHD file: /var/lib/svirt-sim/images/good-one.vhd
/sbin/not-qemu[1298]: not-qemu with pid: 1298 still alive
/sbin/not-qemu[1300]: new file path: /var/lib/svirt-sim/images/..
/sbin/not-qemu[1300]: opening VHD file: /var/lib/svirt-sim/images/..
/sbin/not-qemu[1300]: open failed: Is a directory
/sbin/not-qemu[1298]: opening VHD file: /var/lib/svirt-sim/images/good-one.vhd
/sbin/not-qemu[1298]: not-qemu with pid: 1298 still alive
/sbin/not-qemu[1300]: new file path: /var/lib/svirt-sim/images/bad-one.vhd
/sbin/not-qemu[1300]: opening VHD file: /var/lib/svirt-sim/images/bad-one.vhd
/sbin/not-qemu[1300]: not-qemu with pid: 1300 still alive
/sbin/not-qemu[1298]: opening VHD file: /var/lib/svirt-sim/images/good-one.vhd
/sbin/not-qemu[1298]: not-qemu with pid: 1298 still alive
/sbin/not-qemu[1300]: new file path: /var/lib/svirt-sim/images/good-one.vhd
/sbin/not-qemu[1300]: opening VHD file: /var/lib/svirt-sim/images/good-one.vhd
/sbin/not-qemu[1300]: open failed: Permission denied
kernel: [ 2678.904913] type=1400 audit(1315107146.430:22): avc:  denied  { read write } for  pid=1300 comm="not-qemu" name="go
od-one.vhd" dev=sda3 ino=1444187 scontext=system_u:system_r:not_qemu_t:s0:c850 tcontext=system_u:object_r:mcs_server_image_t:s
0:c440 tclass=file
/sbin/not-qemu[1298]: opening VHD file: /var/lib/svirt-sim/images/good-one.vhd
/sbin/not-qemu[1298]: not-qemu with pid: 1298 still alive
/sbin/not-qemu[1300]: new file path: /var/lib/svirt-sim/images/.
/sbin/not-qemu[1300]: opening VHD file: /var/lib/svirt-sim/images/.
/sbin/not-qemu[1300]: open failed: Is a directory
/sbin/not-qemu[1298]: not-qemu with pid 1298 stopping on user request
/sbin/not-qemu[1298]: not-qemu with pid 1298 shutting down
/sbin/not-qemu[1300]: not-qemu with pid 1300 stopping on user request
/sbin/not-qemu[1300]: not-qemu with pid 1300 shutting down

There’s a bit of noise here in that ‘bad’ not-qemu instances loop through the images directory contents and don’t differentiate between files and directories so it will attempt to open and read the special directories . and .. which I’ll clean up in the future.

The important bits are that we can see the ‘bad’ not-qemu instance with PID 1300 attempt to access a VHD that doesn’t belong to it. This results in the operation failing and an AVC being displayed.

Conclusion

Whenever I’m writing SELinux policy I’m always looking for ways to test it. Policy analysis that proves a policy fulfills a specific security goal is the holy grail of security and this simulation isn’t policy analysis. Instead it’s just a simple simulation giving an example of what we’ve discussed using logical representation of the mlsconstrain statements in the SELinux MCS policy. Specifically, the category component of a processes context must dominate that of the object it’s accessing. We’ve represented this in logic previously now this simulation shows that the security server in the Linux kernel really does enforce this constraint.

Another interesting use of this simulation would be to use the real QEMU policy from the reference policy and a specifically crafted binary to push the limits of the policy. This binary could contain any number of known ways information can be leaked across QEMU instances. If anyone out there tries this I’d love to hear about it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s