Measured Launch on OE core

It’s been 4 months since my last post but I’ve been working on some fun stuff. Said work has progressed to the point where it’s actually worth talking about publicly, so I’m crawling out from under my favorite rock and putting it “out there”.

My last few bits of writing were about some random OpenEmbedded stuff, basically outlining things I was learning while bumbling my way through the OE basics. I’ve been reading through the meta-selinux and meta-virtualization layers and they’re a great place to learn. Over the winter holiday here I had some extra vacation time from my day job to burn, so I finally got serious about a project I’ve been meaning to start for way too long.

meta-measured

Over the past year I’ve been thinking a lot about the “right way” to measure a software system. We’ve implemented a measurement architecture on XT but this has a few downsides: First, a system as large as XT is very difficult to use as a teaching tool. It’s hard to explain and show someone the benefits of measuring a system when your example is large, complex and the relevant bits are spread throughout the whole system. Even our engineers who know our build system inside and out often get lost in the details. Second, the code belongs to Citrix and closed source software isn’t very useful to anyone except the people selling it.

So after reading through the meta-selinux and meta-xen layers a bunch and learning a good bit about writing recipes I’ve started work on a reference image for a “measured system”. I’m keeping the recipes that make up this work in a layer I call ‘meta-measured’. For this first post on the topic of measured systems I’ll stick to discussing the basic mechanics of its construction. This includes some data on the supporting recipes and some of the component parts necessary for booting it. Hopefully along the way I’ll be able to justify the work by discussing the potential benefits to system security, but the theory and architecture discussions will be left for a later post.

get the source

If you’re interested in just building it and playing with the live image this is where you should start. Take a look and let me know what you think. Feedback would be much appreciated.

All of the work I’ve done to get this first bootable image working is up on my github. You can get there from here: https://github.com/flihp. The ‘meta-measured’ layer is here: https://github.com/flihp/meta-measured.git. To automate setting up a build environment for this I’ve got another repo with a few scripts to check out the necessary supporting software (bitbake / OE / meta-intel etc), a local.conf (which you may need to modify for your environment), and a script to build the ‘iso’ that can be written to a USB drive for booting a test system: https://github.com/flihp/measured-build-scripts.

The best way to build this currently is to checkout the measured-build-scripts repo:

git clone git://github.com/flihp/measured-build-scripts.git

run the ‘fetch.sh’ script to populate the required git submodules and to clone the meta-measured layer:

cd measured-build-scripts
./fetch.sh

build the iso

If you try to run the ./build.sh script next, as you would think you should, the build will currently fail. It will do so while attempting to download the SINIT / ACM module for TXT / tboot, because Intel hides the ACMs behind a click-through license that must be accepted before the files can be downloaded. I’ve put the direct link to it in the recipe but the download fails unless you’ve got the right cookie in your browser, so wget blows up. Download it yourself from here: http://software.intel.com/en-us/articles/intel-trusted-execution-technology, then drop the zip into your ‘download’ directory manually. I’ve got the local.conf with DL_DIR hardwired to /mnt/openembedded/downloads so you’ll likely want to change this to suit your environment.
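
Concretely, that manual step looks something like this. The DL_DIR path is just an example and the ACM zip filename will vary depending on which package Intel is currently shipping:

# in local.conf, point DL_DIR somewhere sensible, e.g.:
#   DL_DIR = "/home/me/oe-downloads"
# then drop the manually downloaded ACM zip into that directory:
cp ~/Downloads/*SINIT*.zip /home/me/oe-downloads/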

Anyway I’ll sort out a way to fool the Intel lawyer wall eventually … I’m tempted to mirror these files since the legal notice seems to allow this but I don’t really have the bandwidth ATM. Once you’ve got this sorted, run the build.sh script. I typically tee the output to a file for debugging … this is some very ‘pre-alpha’ stuff so you should expect to debug the build a bit 🙂

./build.sh | tee build.log

This will build a few images from the measured-image-bootimg recipe (tarballs, cpios, and an iso). The local.conf I’ve got in my build directory is specific to my test hardware so if you’ve got an Intel SugarBay system to test on then you can dump the ISO directly to a USB stick and boot it. If you don’t have a SugarBay system then you’ll have to do some work to get it booting since this measured boot stuff is closely tied to the hardware, though the ACMs I’ve packaged work for 2nd and 3rd gen i5 and i7 hardware (Sandy and Ivy Bridge).
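
For reference, writing the image to a USB stick is just a dd. The image filename below is a guess at what the measured-image-bootimg recipe produces, so check tmp/deploy/images/ for the actual name, and make very sure /dev/sdX really is your USB stick:

sudo dd if=tmp/deploy/images/measured-image-bootimg-*.hddimg of=/dev/sdX bs=4M
sync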

recipes

I’ve organized the recipes that make up this work into two categories: those that are specific to the TPM and those that are specific to TXT / tboot. Each of these two technologies requires some kernel configs so those are separated out into fragments like I’ve found in other layers. My test hardware has USB 3.0 ports, which the kernel configs in the base OE layers don’t seem to enable yet, so I’ve included this config in my oe-measured distro just so I can use the ports on the front of my test system.

The TPM recipes automate building the Trousers daemon, libtspi and some user space tools that consume the TSS interface. Recipes for the TPM software are pretty straightforward as most are autotools projects. Some work was required to get the trousers project separated into packages for the daemon and library.

The tboot recipes were a bit more work because tboot packages a bunch of utilities in the main tboot source tree, so they had to be separated out into different packages (this work is still on-going). Further, tboot doesn’t use autotools and it squashes most compiler flags that the OE environment passes in. The compiler flags required by tboot are static, which stands at odds with OE and a cross-compiled environment that wants to change the path to everything including the compiler.

I’ve no clue if tboot will build properly on anything other than an Intel system. Further, the issue of Intel hiding the ACMs required for their chipsets behind an EULA wall is annoying, as the default OE fetcher won’t work.

images

My first instinct is always to describe a system by construction: from the bottom up. In this case I think going top-down is a better approach, so we’ll start with the rootfs and work backwards. The TPM recipes include two images based on the core-image from OE core: one initramfs image and one rootfs. The rootfs is just the core-image with the TPM kernel drivers, trousers daemon, tpm-tools and the tpm-quote-tools. I haven’t done much with this rootfs other than booting it up and seeing whether TXT and the TPM work as expected.

There’s also an initramfs with the TPM kernel drivers, trousers daemon and the tpm-tools but not the quote tools. This is a very minimal initramfs with the TSS daemon loaded manually in the initrd script. It’s not expected that users will be using the tpm-tools interactively here but that’s what I’ve been doing for initial testing. Only the tpm_extendpcr tool (open source from Citrix) is used to extend a PCR with the sha1sum hash of the rootfs before the call to switch_root. This requires that the ‘coreutils’ package be included just for the one utility, which unfortunately bloats the initramfs. Slimming this down shouldn’t be too much work in the future. Anyway I think this is ‘the right way’ to extend the measurement chain from the initramfs up to the rootfs of the system.
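
To make that concrete, here’s a rough sketch of what the measure-then-switch step in the initrd script looks like. This is not the actual script from the layer: the device path and PCR index are examples, and the tpm_extendpcr arguments are an assumption, so check the tool’s usage output.

#!/bin/sh
# sketch only: measure the rootfs, extend a PCR, then switch_root
ROOT_DEV=/dev/sda2     # example; the real script takes this from the kernel cmdline
PCR=14                 # example PCR index, also passed on the kernel cmdline

HASH=$(sha1sum "${ROOT_DEV}" | cut -d' ' -f1)
tpm_extendpcr -p "${PCR}" -h "${HASH}"    # hypothetical flags, see the tool's help

mount -o ro "${ROOT_DEV}" /mnt/root
exec switch_root /mnt/root /sbin/init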

The rest of the measurements we care about are taken care of by the components from the TXT recipes. There’s only one image in the TXT recipe group however. This is derived from the OE core live image and it’s intended to be ‘deployable’ in the language of OE recipes. I think this means an hddimg or an ISO image, basically something you can ‘dd’ to disk and boot. Currently it’s the basis for a live image but could easily be used for something like an installer simply by switching out the rootfs.

This image is not a separate root filesystem but instead an image created with the files necessary to boot the system: syslinux (configured with the mboot.c32 comboot module), tboot, the ACMs, and the initrd and rootfs from the TPM recipes. tboot measures the bootloader config, all of the boot modules and a bunch of other stuff (see the README in the tboot sources for details). It stores these measurements in the TPM for us, creating the ‘dynamic root of trust for measurement’ (DRTM).
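
For a feel of how those pieces chain together, a syslinux entry using mboot.c32 looks roughly like the sketch below. The filenames are illustrative (the ACM in particular depends on your chipset), not the exact names the image recipe generates:

label measured
  kernel mboot.c32
  append tboot.gz logging=serial,memory --- vmlinuz console=ttyS0 --- initrd.cpio.gz --- sinit_acm.bin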

Once tboot has measured all of the modules, the initramfs takes over. The initramfs then measures the rootfs as described above before the switch to root. I’ve added a few kernel parameters to pass the name of the rootfs and the PCR where its measurement is to be stored.

If the rootfs is measured on each boot it must be mounted read-only to prevent its measurement from changing … yup, even mounting a journaled file system read-write will modify the journal and change the filesystem. Creating a read-only image is a bit of work so for this first prototype I’ve used a bit of a shortcut: I mount the rootfs read-only, create a ramfs read-write, then combine the two in a unionfs. In this configuration the rootfs looks like a read / write mount once it boots, but on each boot the measurements in the TPM are the same.
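
The mount sequence is roughly the following sketch; the exact options depend on which unionfs implementation the kernel carries, and the device and mount points are examples:

# sketch of the read-only root plus writable overlay
mount -o ro /dev/sda2 /mnt/rofs               # the measured, read-only rootfs
mount -t tmpfs tmpfs /mnt/rwfs                # throw-away writable branch
mount -t unionfs -o dirs=/mnt/rwfs=rw:/mnt/rofs=ro none /mnt/root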

Next Steps

Measuring a system is all well and good but who cares? Measurements are only useful when they’re communicated to external parties. For now this image only takes measurements, and these measurements are the same on each boot. That’s it. Where this becomes immediately useful is that these measurements can be predicted in the build.

The PCRs 0-7 are reserved for the BIOS and we have no way of predicting these values currently, as they’re unique to the platform and that’s messy. The tboot PCRs however (17, 18 and 19 in the Legacy mapping currently used) can be calculated based on the hashing done by tboot (read their docs and http://www.mail-archive.com/tboot-devel@lists.sourceforge.net/msg00069.html). The PCR value containing the measurement of the rootfs can be calculated quite simply as well.
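
For the rootfs PCR the arithmetic is just the standard TPM extend operation: PCR_new = SHA1(PCR_old || digest). Assuming the PCR starts at all zeros and is only extended once with the rootfs hash, you can predict it from the build output with something like this (the image path is an example):

# predict the post-extend PCR value for the rootfs measurement
old=0000000000000000000000000000000000000000
meas=$(sha1sum tmp/deploy/images/core-image-tpm.ext3 | cut -d' ' -f1)
printf '%s%s' "${old}" "${meas}" | xxd -r -p | sha1sum | cut -d' ' -f1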

For a reference live image this is interesting only in an academic capacity. As I suggest above, this image can be used as a template for something like an installer which would give the predictability of PCR values much deeper meaning: Consider an installer architecture where the installer itself is a very small rootfs that downloads the install package from a remote server (basically Debian’s netboot iso or a PXE boot setup). Assuming we have a method for exchanging system measurements (more future work) it would be very useful for the remote server to be able to evaluate measurements from the installer before releasing the install package.

This is probably a good place to wrap up this post. The meta-measured layer I’ve described is still very new and the images I’ve built are still useful only for ‘tire-kicking’. My next post will hopefully discuss predicting measurement values in the build system and other fun stuffs.

Openembedded Yocto Native Hello World: Take 2

A while back I wrote about some problems I was having with native OpenEmbedded recipes that were building packages with raw Makefiles (no autotools). I wrote up the problem and the workaround I was using here. I got some feedback pointing out what I was doing wrong but I guess my brain just didn’t process it. I was using uppercase variables while the GNU Make docs specifically call for lowercase! These variables get passed to Make, likely through the environment, and then everything just works.

So here’s my retraction: I was wrong 🙂 The ‘hello world’ Makefile should look like this:

.PHONY : clean uninstall

prefix ?= /usr
exec_prefix ?= $(prefix)
bindir ?= $(exec_prefix)/bin

HELLO_src = hello.c
HELLO_bin = hello
HELLO_tgt = $(DESTDIR)$(bindir)/$(HELLO_bin)

all : $(HELLO_bin)

$(HELLO_bin) : $(HELLO_src)

$(HELLO_tgt) : $(HELLO_bin)
	install -d $(DESTDIR)$(bindir)
	install -m 0755 $^ $@

clean :
	rm $(HELLO_bin)

install : $(HELLO_tgt)

uninstall :
	rm $(HELLO_tgt)

You can download the recipe here: hello_1.1.bb.

TXT Capable Desktop Virtualization System

Having worked on XenClient XT for the past year I’ve experienced the pain of debugging vendors’ TXT implementations first hand. TXT may be a nearly 6 year old technology but it’s just now coming into use and many vendors’ platforms have only received internal testing. We’ve found that platforms can fail in a number of strange ways and we’ve had to work with the vendors to get their implementations working for a system like XT that uses tboot as part of our measured launch.

For development Citrix has provided me with a number of systems but I’ve been meaning to put one together for myself for some time now. I’ve always liked building my own so I wasn’t thrilled with the prospect of purchasing a Dell / HP system. Home builds are always a bit cooler, a bit cheaper, and more fun in general. That said, I was a bit worried about being able to find a motherboard / CPU combo with full AND WORKING VT-x, VT-d and TXT. It wasn’t as bad as I expected. So the following is a breakdown of the home-built system I put together specifically to run XT.

Case

I always start building systems with the case. This will dictate size which in turn limits your choices for motherboards. I’ve had a string of successes building systems in Lian Li cases so again they were my first choice. I wanted this system to be as small as possible. Lian Li happens to make probably the best mini-ITX case on the market: the PC-Q02A. This case is tiny and it comes bundled with a 300W power supply. No room in the back of the case for PCI cards either so if you buy this don’t expect to throw a graphics card in it. Whatever you need has to be on the motherboard!

CPU

Since I intend to run XT on this system the CPU has to support the full Intel vPro suite including TXT. This limited me to high-end Intel i5 and i7 processors. Since this system will be in a small, low power case I wanted a 65W CPU and went with the Intel i7-2600S. CPUs aren’t really where you want to save money on your build so I didn’t skimp here.

Motherboard

The motherboard is really where vPro and TXT are either made or broken. The BIOS is where CPU features are either enabled or disabled and many motherboard vendors don’t list anything in their docs about TXT compatibility. This is mostly because home users typically don’t really care. In this case we do, so some research is required. I played it safe and went with an Intel DQ67EP. TXT and the TPM worked flawlessly. One deviation from the Dell and HP platforms I’ve played with is that the TPM came without an EK (endorsement key) populated. It’s a simple case of running tpm_createek on the system, but because all of the vendor platforms come with an EK pre-populated the XT code doesn’t account for this situation. Easy to work around.
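
If you hit the same situation, the fix with the tpm-tools package looks roughly like this; whether you also need to take ownership yourself depends on how the platform gets provisioned, and the well-known-secret flags are just one option:

# create the missing endorsement key, then (optionally) take ownership
tpm_createek
tpm_takeownership -y -z    # -y / -z use the well-known owner and SRK secrets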

RAM

From here it’s just a matter of getting RAM and hard drive / CD-ROM. Since this system will be running virtualized desktops the more RAM the better. XT doesn’t over-commit RAM, so if you want to run two desktops with 4G of RAM each you’ll need a bit more than 8G since dom0 and the service VMs will need a bit too. I picked up 16G of G.SKILL Ares RAM. Generally I run 3 desktops: one Debian Squeeze for development, one Debian Wheezy for my personal stuffs and one Windows 7 for required email and GoToMeetings, each with 4G of RAM. This system has no problems handling all 3 at the same time.

Disks

With hard disks faster is always better and bigger doesn’t hurt. This is where I saved money though, since I’ve already got a huge (6T!) NFS server for bulk storage. I went with a modest 120G OCZ Agility 3. I haven’t done any benchmarking but it’s big enough for my root filesystems and fast enough as well. I also put in a Sony Optiarc BC-5650H-01 6X Slim Blu-Ray Reader that I got on the cheap from an eBay seller. It’s one of those fancy slimline drives so there’s no tray to eject. I hardly ever use optical media except to rip CDs on to my NFS storage. Not sure how much I’ll use this but it’s nice even though it doesn’t match the case 🙂

Photos

I wish there were something in this last photo to give some perspective on the size of the case. Basically it’s the size of a lunchbox 🙂

XT Configuration

Once it’s all put together with XT installed it pretty much “just works”. The USB stack on XT isn’t perfect so some USB devices won’t work unless you pass the USB controller directly through to a guest using Xen’s PCI passthrough stuff. Since XT doesn’t support USB 3.0 anyways I’ve passed the 3.0 controller on the motherboard through directly to my Windows 7 guest to get my webcam working. The Plantronics headset you see in the photo works fine over the virtualized USB stack though, and together they make for a pretty sweet VoIP / Video Conferencing / GoToMeeting setup. So that’s it. A home-built Intel i7 system with 16G of RAM, an SSD and Intel TXT / measured boot running XenClient XT. Pretty solid home system by my standards.

Chrome web sandbox on XenClient

There’s lots of software out there that sets up a “sandbox” to protect your system from untrusted code. The examples that come to mind are Chrome and Adobe’s Flash sandbox. The strength of these sandboxes is an interesting point of discussion. Strength is always related to the mechanism, and if you’re running on Windows the separation guarantees you get are only as strong as the separation Windows affords to processes. If this is a strong enough guarantee for you then you probably won’t find this post very useful. If you’re interested in using XenClient and the Xen hypervisor to get yourself the strongest separation that I can think of, then read on!

Use Case

XenClient allows you to run any number of operating systems on a single piece of hardware. In my case this is a laptop. I’ve got two VMs: my work desktop (Windows 7) for email and other work stuff and my development system that runs Debian testing (Wheezy as of now).

Long story short, I don’t trust some of the crap that’s out on the web to run on either of these systems. I’d like to confine my web browsing to a separate VM to protect my company’s data and my development system. This article will show you how to build a bare bones Linux VM that runs a web browser (Chromium) and little more.

Setup

You’ll need a Linux VM to host your web browser. I like Debian Wheezy since the PV Xen drivers for network and disk work out of the box on XenClient (2.1). There’s a small bug that requires you to use LVM for your rootfs, but I typically do that anyway so no worries there.

Typically I do an install omitting even the “standard system tools” to keep things as small as possible. This results in a root file system that’s < 1G. All you need to do then is install the web browser (chromium), rungetty, and the xinit package. Next is a bit of scripting and some minor configuration changes.
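
On a Wheezy guest that boils down to something like the following (package names as of Wheezy; --no-install-recommends keeps the footprint small):

apt-get install --no-install-recommends chromium rungetty xinit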

inittab

When this VM boots we want the web browser to launch and run full screen. We don’t want a window manager or anything. Just the browser.

When Linux boots, the init process parses the /etc/inittab file. Among the things specified in inittab are the processes that init starts, like getty. Typically inittab starts gettys on 6 ttys but we want it to start chrome for us. We can do this by having init execute rungetty (read the man page!), which we can then have execute arbitrary commands for us:

# /sbin/getty invocations for the runlevels.
#
# The "id" field MUST be the same as the last
# characters of the device (after "tty").
#
# Format:
#  <id>:<runlevels>:<action>:<process>
#
# Note that on most Debian systems tty7 is used by the X Window System,
# so if you want to add more getty's go ahead but skip tty7 if you run X.
#
1:2345:respawn:/sbin/getty 38400 tty1
2:23:respawn:/sbin/getty 38400 tty2
3:23:respawn:/sbin/getty 38400 tty3
4:23:respawn:/sbin/getty 38400 tty4
5:23:respawn:/sbin/getty 38400 tty5
6:23:respawn:/sbin/rungetty tty6 -u root /usr/sbin/chrome-restore.sh

Another configuration change you’ll have to make is in /etc/X11/Xwrapper.config. The default configuration in this file prevents users from starting X if their controlling TTY isn’t a virtual console. Since we’re kicking off chromium directly we need to relax this restriction:

allowed_users=anybody

chromium-restore script

Notice that we have rungetty execute a script for us and it does so as the root user. We don’t want chromium running as root but we need to do some set-up before we kick off chromium as an unprivileged user. Here’s the chrome-restore.sh script:

#!/bin/sh

USER=chromium
HOMEDIR=/home/${USER}
HOMESAFE=/usr/share/${USER}-clean
CONFIG=${HOMEDIR}/.config/chromium/Default/Preferences
LAUNCH=$(which chromium-launch.sh)
if [ ! -x "${LAUNCH}" ]; then
	echo "web-launch.sh not executable: ${LAUNCH}"
	exit 1
fi
CMD="${LAUNCH} ${CONFIG}"

# restore the unprivileged user's home directory to a known good state
rsync -avh --delete ${HOMESAFE}/ ${HOMEDIR}/ > /dev/null 2>&1
chown -R ${USER}:${USER} ${HOMEDIR}

# run startx as the unprivileged user; STARTUP tells startx what to run once X is up
/bin/su - ${USER} -c "STARTUP=\"${CMD}\" startx" < /dev/null
# when chromium (and thus X) exits, power off the VM
shutdown -Ph now

The first part of this script sets up the home directory for the user (chromium) that will be running chromium. This is the equivalent of us restoring the user’s home directory to a “known good state”. This means that the directory located at /usr/share/chromium-clean is a “known good” home directory for us to start from. On my system it’s basically an empty directory with chrome’s default config.

The second part of the script, well really the last two lines, just runs startx as an unprivileged user. startx kicks off the X server, but first we set a variable STARTUP to be the name of another script: chromium-launch.sh. When this variable is set, startx runs the command from the variable after the X server is started. This is a convenient way to kick off an X server that runs just a single graphical application.

The last command shuts down the VM. The shutdown command will only be run after the X server terminates which will happen once the chromium process terminates. This means that once the last browser tab is closed the VM will shutdown.

chromium-launch script

The chromium-launch.sh script looks like this:

#!/bin/sh

CONFIG=$1
if [ ! -f "${CONFIG}" ]; then
	echo "cannot locate CONFIG: ${CONFIG}"
	exit 1
fi

LINE=$(xrandr -q 2> /dev/null | grep Screen)
WIDTH=$(echo ${LINE} | awk '{ print $8 }')
HEIGHT=$(echo ${LINE} | awk '{ print $10 }' | tr -d ',')

sed -i -e "s&(s+"bottom":s+)-?[0-9]+&1${HEIGHT}&" ${CONFIG}
sed -i -e "s&(s+"left":s+)-?[0-9]+&10&" ${CONFIG}
sed -i -e "s&(s+"right":s+)-?[0-9]+&1${WIDTH}&" ${CONFIG}
sed -i -e "s&(s+"top":s+)-?[0-9]+&10&" ${CONFIG}
sed -i -e "s&(s+"work_area_bottom":s+)-?[0-9]+&1${HEIGHT}&" ${CONFIG}
sed -i -e "s&(s+"work_area_left":s+)-?[0-9]+&10&" ${CONFIG}
sed -i -e "s&(s+"work_area_right":s+)-?[0-9]+&1${WIDTH}&" ${CONFIG}
sed -i -e "s&(s+"work_area_top":s+)-?[0-9]+&10&" ${CONFIG}

chromium

It’s a pretty simple script. It takes one parameter, which is the path to the main chromium config file. It queries the X server through xrandr to get the screen dimensions (WIDTH and HEIGHT), which means it must be run after the X server starts. It then re-writes the relevant values in the config file to the maximum screen width and height so the browser runs “full screen”. Pretty simple stuff … once you figure out the proper order to do things and the format of the Preferences file, which was non-trivial.

User Homedir

The other hard part is creating the “known good” home directory for your unprivileged user. What I did was start up chromium once manually, which causes the standard chromium configuration to be generated with default values. I then copied this off to /usr/share to be restored on each boot.
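
In other words, something along these lines (run once while setting up the VM; paths match the chrome-restore.sh script above):

# run chromium once as the unprivileged user to generate a default profile,
# close it, then capture the pristine home directory
rsync -avh --delete /home/chromium/ /usr/share/chromium-clean/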

Conclusion

So hopefully these instructions are enough to get you a Linux system that boots and runs Chromium as an unprivileged user. It should restore that user’s home directory to a known good state on each boot so that any downloaded data will be wiped clean. When the last browser tab is closed it will power off the system.

I use this on my XenClient XT system for browsing sites that I want to keep separate from my other VMs. It’s not perfect though and as always there is more that can be done to secure it. I’d start by making the root file system read only and adding SELinux would be fun. Also the interface is far too minimal. Finding a way to handle edge cases like making pop-ups manageable and allowing us to do things like control volume levels would also be nice. This may require configuring a minimal window manager which is a pretty daunting task. If you have any other interesting ways to make this VM more usable or lock it down better you should leave them in the comments.

openembedded yocto native hello world

NOTE: I took the time to get to the bottom of the issue discussed in this post. There’s a new post here that explains the “right way” to use Makefiles with yocto. As always, the error in this post was mine 🙂

I’ve officially “drunk the Kool-Aid” and I’m convinced OpenEmbedded and Yocto are pretty awesome. I’ve had a blast building small Debian systems on PCEngines hardware in the past and while I’m waiting for my Raspberry Pi to arrive I’ve been trying to learn the ins and outs of Yocto. The added bonus is that the XenClient team at Citrix uses OpenEmbedded for our build system so this work can also fall under the heading of “professional development”.

Naturally the first task I took on was way too complicated so I made a bunch of great progress (more about that in a future post once I get it stable) but then I hit a wall that I ended up banging my head against for a full day. I posted a cry for help on the mailing list and didn’t get any responses so I set out to remove as many moving parts as possible and find the root cause.

First things first: read the Yocto development manual and the Yocto reference for whatever release you’re using. This is essential because no one will help you till you’ve read and understood these 🙂

So the software I’m trying to build is built using raw Makefiles, none of that fancy autotools stuff. This can be a bit of a pain because depending on the Makefiles, it’s not uncommon for assumptions to be made about file system paths. Openembedded is all about cross compiling so it wants to build and install software under all sorts of strange roots and some Makefiles just can’t handle this. I ran into a few of these scenarios but nothing I couldn’t overcome.

Getting a package for my target architecture wasn’t bad but I did run into a nasty problem when I tried to get a native package built. From the searches I did on the interwebs it looks like there have been a number of ways to build native packages. The current “right way” is simply to have your recipe extend the native class. Thanks to XorA for documenting his/her new package workflow for that nugget.

BBCLASSEXTEND = "native"

After having this method blow up for my recipe I was tempted to hack together some crazy work around. I really want to upstream the stuff I’m working on though and I figure having crazy shit in my recipe to work around my misunderstanding of the native class was setting the whole thing up for failure. So instead I went back to basics and made a “hello world” program and recipe (included at the end of this post) hoping to recreate the error and hopefully figure out what I was doing wrong at the same time.

It took a bit of extra work but I was able to recreate the issue with a very simple Makefile. First the error message:

NOTE: package hello-native-1.0-r0: task do_populate_sysroot: Started
ERROR: Error executing a python function in /home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb:
CalledProcessError: Command 'tar -cf - -C /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots/i
686-linux -ps . | tar -xf - -C /home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux' returned non-zero exit status 2 with output tar: /home/build/poky-edison-6.0/build/tmp/work
/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux: Cannot chdir: No such file or directory
tar: Error is not recoverable: exiting now
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors


ERROR: The stack trace of python calls that resulted in this exception/failure was:
ERROR:   File "sstate_task_postfunc", line 10, in 
ERROR:
ERROR:   File "sstate_task_postfunc", line 4, in sstate_task_postfunc
ERROR:
ERROR:   File "sstate.bbclass", line 19, in sstate_install
ERROR:
ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 59, in copytree
ERROR:     check_output(cmd, shell=True, stderr=subprocess.STDOUT)
ERROR:
ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 121, in check_output
ERROR:     raise CalledProcessError(retcode, cmd, output=output)
ERROR:
ERROR: The code that was being executed was:
ERROR:      0006:        bb.build.exec_func(intercept, d)
ERROR:      0007:    sstate_package(shared_state, d)
ERROR:      0008:
ERROR:      0009:
ERROR:  *** 0010:sstate_task_postfunc(d)
ERROR:      0011:
ERROR: (file: 'sstate_task_postfunc', lineno: 10, function: )
ERROR:      0001:
ERROR:      0002:def sstate_task_postfunc(d):
ERROR:      0003:    shared_state = sstate_state_fromvars(d)
ERROR:  *** 0004:    sstate_install(shared_state, d)
ERROR:      0005:    for intercept in shared_state['interceptfuncs']:
ERROR:      0006:        bb.build.exec_func(intercept, d)
ERROR:      0007:    sstate_package(shared_state, d)
ERROR:      0008:
ERROR: (file: 'sstate_task_postfunc', lineno: 4, function: sstate_task_postfunc)
ERROR: Function 'sstate_task_postfunc' failed
ERROR: Logfile of failure stored in: /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/temp/log.do_populate_sysroot.30718
Log data follows:
| NOTE: QA checking staging
| ERROR: Error executing a python function in /home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb:
| CalledProcessError: Command 'tar -cf - -C /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots
/i686-linux -ps . | tar -xf - -C /home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux' returned non-zero exit status 2 with output tar: /home/build/poky-edison-6.0/build/tmp/wo
rk/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux: Cannot chdir: No such file or directory
| tar: Error is not recoverable: exiting now
| tar: This does not look like a tar archive
| tar: Exiting with failure status due to previous errors
|
|
| ERROR: The stack trace of python calls that resulted in this exception/failure was:
| ERROR:   File "sstate_task_postfunc", line 10, in 
| ERROR:
| ERROR:   File "sstate_task_postfunc", line 4, in sstate_task_postfunc
| ERROR:
| ERROR:   File "sstate.bbclass", line 19, in sstate_install
| ERROR:
| ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 59, in copytree
| ERROR:     check_output(cmd, shell=True, stderr=subprocess.STDOUT)
| ERROR:
| ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 121, in check_output
| ERROR:     raise CalledProcessError(retcode, cmd, output=output)
| ERROR:
| ERROR: The code that was being executed was:
| ERROR:      0006:        bb.build.exec_func(intercept, d)
| ERROR:      0007:    sstate_package(shared_state, d)
| ERROR:      0008:
| ERROR:      0009:
| ERROR:  *** 0010:sstate_task_postfunc(d)
| ERROR:      0011:
| ERROR: (file: 'sstate_task_postfunc', lineno: 10, function: )
| ERROR:      0001:
| ERROR:      0002:def sstate_task_postfunc(d):
| ERROR:      0003:    shared_state = sstate_state_fromvars(d)
| ERROR:  *** 0004:    sstate_install(shared_state, d)
| ERROR:      0005:    for intercept in shared_state['interceptfuncs']:
| ERROR:      0006:        bb.build.exec_func(intercept, d)
| ERROR:      0007:    sstate_package(shared_state, d)
| ERROR:      0008:
| ERROR: (file: 'sstate_task_postfunc', lineno: 4, function: sstate_task_postfunc)
| ERROR: Function 'sstate_task_postfunc' failed
NOTE: package hello-native-1.0-r0: task do_populate_sysroot: Failed
ERROR: Task 3 (virtual:native:/home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb, do_populate_sysroot) failed with exit code '1'
ERROR: 'virtual:native:/home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb' failed

So even with the most simple Makefile I could cause a native recipe build to blow up. Here’s the Makefile:

.PHONY : all clean install uninstall

PREFIX ?= $(DESTDIR)/usr
BINDIR ?= $(PREFIX)/bin

HELLO_src = hello.c
HELLO_bin = hello
HELLO_tgt = $(BINDIR)/$(HELLO_bin)

all : $(HELLO_bin)

$(HELLO_bin) : $(HELLO_src)

$(HELLO_tgt) : $(HELLO_bin)
	install -d $(BINDIR)
	install -m 0755 $^ $@

clean :
	rm $(HELLO_bin)

install : $(HELLO_tgt)

uninstall :
	rm $(BINDIR)/$(HELLO_tgt)

And here’s the relevant install method from the bitbake recipe:

do_install () {
    oe_runmake DESTDIR=${D} install
}

Notice I’m using the variable DESTDIR to tell the Makefile the root (not just /) to install things to. This should work right? It works for a regular package but not for a native one! This drove me nuts for a full day.

The solution to this problem lies in some weirdness in the Yocto native class when combined with the populate_sysroot method. The way I figured this out was by inspecting the differences in the environment when building hello vs hello-native. When building the regular package for the target architecture variables like bindir and sbindir were what I would expect them to be:

bindir="/usr/bin"
sbindir="/usr/sbin"

but when building hello-native they get a bit crazy:

bindir="/home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux/usr/bin"
sbindir="/home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux/usr/sbin"

This is a hint at the source of the crazy path that staging is trying to tar up above in the error message. Further, if you look in the build directory for a regular target arch package you’ll see your files where you expect in ${D}sysroot-destdir/usr/bin, but for a native build you’ll see stuff in ${D}sysroot-destdir/home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux/usr/bin. Pretty crazy right? I’m sure there’s a technical reason for this but it’s beyond me.
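
A quick way to see this for yourself is to dump the environment bitbake computes for each variant and compare the install path variables:

# dump the computed environment for the target and native variants
bitbake -e hello | grep -E '^(export )?(bindir|sbindir)='
bitbake -e hello-native | grep -E '^(export )?(bindir|sbindir)='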

So the way you can work around this is by telling your Makefiles about paths like bindir through the recipe. A fixed do_install would look like this:

do_install () {
    oe_runmake DESTDIR=${D} BINDIR=${D}${bindir} install
}

For more complicated Makefiles you can probably specify a PREFIX and set this equal to the ${prefix} variable but YMMV. I’ll be trying this out to keep my recipes as simple as possible.
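
For a Makefile like the one above that derives BINDIR from a prefix, the do_install would look something like this (an untested sketch, and as noted, YMMV depending on how the Makefile composes its paths):

do_install () {
    oe_runmake DESTDIR=${D} PREFIX=${D}${prefix} install
}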

If you want to download my example the recipe is here. This will pull down the hello world source code and build the whole thing for you.

Linux bridge forward EAPOL 8021x frames

XenClient is no different from other Xen configurations in that the networking hardware is shared between guests through a bridge hosted in dom0 (or a network driver domain in the case of XenClient XT). For most use cases the standard Linux bridge will route your traffic as expected. We ran into an interesting problem however when a customer doing a pilot on XenClient XT tried to authenticate their guest VMs using EAPOL (802.1x auth over ethernet). The bridge gobbled up their packets and we got some pretty strange bug reports as a result.

Just throwing “linux bridge EAPOL 8021x” into a search engine will return a number of hits from various mailing lists where users report similar issues. The fix is literally a one line change that drops a check on the destination MAC address. This check is there to ensure compliance with the 802.1D standard, which requires layer 2 bridges to drop packets destined to the “bridge filter MAC address group“. Since XenClient is a commercial product and the fix is in code that is responsible for our guest networking (which is pretty important) we wanted to code up a way to selectively enable this feature on a per-bridge basis using a sysfs node. We / I also tested the hell out of it for a few days straight.

The end result is a neat little patch that allows users to selectively pass EAPOL packets from their guests across the layer 2 bridge in dom0 / ndvm and out to their authentication infrastructure. The patch is open source just like the kernel and ships on the XenClient source CD. It’s also available here for your convenience 🙂

OE-Core Yocto gcc timeout

I’ve been thrashing around trying to get the upstream OE to build an image for me. Today I finally made a concerted effort over a few hours to dive deep and do this right. It turns out I was using the “old” OE repos when I should have been using the “new” build system from the Yocto Project. Their documentation is excellent but still, my first build failed.

What’s this? The GCC recipe failing because of a network timeout? Oddly enough it actually downloaded some of the sources but not all of ’em.

| svn: REPORT of '/svn/gcc/!svn/vcc/default': Could not read response body: connection was closed by server (http://gcc.gnu.org)
NOTE: package gcc-cross-initial-4.6.1+svnr175454-r10: task do_fetch: Failed
ERROR: Task 5 (/home/build/poky-edison-6.0/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb, do_fetch) failed with exit code '1'
ERROR: '/home/build/poky-edison-6.0/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb' failed

At this point I just tried again and it failed in the same place, but had checked out more of the code. A quick search turns up that a similar error is common when checking out code from SVN servers over HTTP. Apache just has a tendency to time out when checking out large repositories with mod-svn. The suggested fix is to increase the timeout value in your Apache configs … except these configs are on the GNU web servers and we can’t change them.

What we can change though is the protocol bitbake uses when getting the sources. Just change the proto from ‘http’ to ‘svn’ in the SRC_URI in gcc-4.6.inc (found at meta/recipes-devtools/gcc/gcc-4.6.inc) and we’re almost good. It’ll look like this when you’re done:


SRC_URI = "svn://gcc.gnu.org/svn/gcc/branches;module=${BRANCH};proto=svn

It still timed out for me a few times but it ran for much longer than the HTTP protocol option. HTTP timed out after about 10 minutes, SVN made it almost an hour before timing out … You’d think there would be a tarball of these sources mirrored somewhere so we didn’t have to kill the GNU SVN servers on every fresh build. Something to look into I guess. Either way gcc is building now, hopefully I’ll have a build running soon …

UPDATE: With some advice from Scott below I used the poky distro by including DISTRO="poky" in my local.conf file. As promised, bitbake then doesn’t try to check out the gcc svn repository directly from gnu.org. Instead it grabs a tarball from one of the Yocto mirrors and the build takes mere minutes. Thanks Scott!

LaTeX for your Resume / CV

I’m far from a ninja when it comes to LaTeX but I’m a big fan. I’ve written a bit about formatting logical expressions for past homework exercises. I’ve also used it in blog posts for doing the same. It’s a very useful tool even if you’re just using basic templates like me.

A major driver behind my work on this website was the desire to get some of my technical work out into the public domain. Around the same time I started blogging I told myself that I should host my resume on this site as an incentive to keep it up to date. I failed pretty miserably there.

But when I took my position with Citrix nearly a year ago I updated my CV and now I’m resolving to keep it that way. It’s never an easy task to drag an old CV into the modern age and mine had been formatted using a very old style called res from RPI. Instead of struggling to keep the style usable on a modern toolchain I took on migrating to the newer tucv from CTAN.

This was a catalyst for all sorts of useful stuff like getting my CV into a git repo and generally refreshing the content. I’ll be putting together an ‘about’ page on this site where I’ll host it and make the source available as well.

Till then here’s a quick set of instructions for getting tucv working on Debian Wheezy:
Unfortunately I wasn’t able to find tucv in any of the Debian latex / texlive packages. So to get tucv working I installed the basic latex and texlive packages and then downloaded the .dtx and .ins files manually.

Figuring out how to generate a style file from these sources and where to put them was the next trick. A bit of web searching turned up a manual describing how to use LaTeX on Debian:

  1. Copy the .dtx and .ins files to /usr/local/share/texmf/tex/latex/tucv.
  2. Compile both files using latex to generate the package and documentation.
  3. Register the new style using mktexlsr or texhash.
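
Spelled out as commands, that amounts to roughly the following (assuming the standard texmf layout; run as root or with sudo):

mkdir -p /usr/local/share/texmf/tex/latex/tucv
cp tucv.dtx tucv.ins /usr/local/share/texmf/tex/latex/tucv/
cd /usr/local/share/texmf/tex/latex/tucv
latex tucv.ins    # generates the class/package file
latex tucv.dtx    # generates the documentation
mktexlsr          # or texhash; registers the new style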

Then all you have to do is make your resume! Following the examples from the CTAN website is the best way to go. Personally I already had a significant amount of content so most of my time was spent playing with layout.

It’s not perfect and I’ll be playing around to see if I can get better spacing in some of the sections that have a two column layout. The rightmost column is too narrow and forces date ranges onto multiple lines, and I’m not a big fan of how that looks.

sVirt in XenClient

It’s been 5 months since my last post about my ongoing project required by my master’s program at SU. With the hope of eventually getting my degree, this is my last post on the subject. In my previous post on this topic I described a quick prototype I coded up to test an example program and SELinux policy to demonstrate the sVirt architecture. This was a simple example of how categories from the MCS policy can be used to separate multiple instances of the same program. The logical step after implementing a prototype is coding up the real thing, so in this post I’ll go into some detail describing an implementation of the sVirt architecture I coded for the XenClient XT platform. While it may have taken me far too long to write up a description of this project, it’s already running in a commercial product … so I’ve got that going for me 🙂

Background

XenClient is a bit different from upstream Xen in that the management stack has been completely rewritten. Instead of the xend process, which was written in Python, XenClient uses a toolstack written in Haskell. This posed two significant hurdles. First, I’ve done little more than read the first few pages from a textbook on Haskell, so the sVirt code, though not complex, would be a bit over my skill level. Second, SELinux has no Haskell bindings, which the sVirt code would require.

Taking on the task of learning a new functional programming language and writing bindings for a relatively complex API in this language would have taken far longer than reasonable. Though we do intend to integrate sVirt into the toolstack proper, putting this work on the so-called “critical path” would have been prohibitively expensive. Instead we implemented the sVirt code as a C program that is interposed between the toolstack and the QEMU instances it is intended to separate. Thus the toolstack (the xenmgr process) invokes the svirt-interpose program each time a QEMU process needs to be started for a VM. The svirt-interpose process then performs all of the steps necessary to prepare the environment so the requested QEMU instance is separated from the others currently running.

The remainder of this document describes the svirt-interpose program in detail. We begin by describing the interfaces down the call chain between the xenmgr, svirt-interpose and QEMU. We then go into detail describing the internal workings of the svirt-interpose code. This includes the algorithm used to assign categories to QEMU processes and to label the system objects used by these processes. We conclude with a brief analysis of the remaining pieces of the system that could benefit from similar separation. In my first post on this topic I describe possible attacks we’re defending against so I’ll not repeat that here.

Call Chain

As we’re unable to integrate the sVirt code directly into the toolstack we must interpose the execution of the sVirt binary between the toolstack and QEMU. We do this by having the toolstack invoke the sVirt binary and then have sVirt invoke QEMU after performing the necessary SELinux operations. For simplicity we leave the command line that the toolstack would normally pass to QEMU unchanged and simply extract the small piece of information we need from it in the sVirt code. All sVirt requires to do its job is the domain id (domid) of the VM it’s starting a QEMU instance for. This value is the first parameter so extracting it is quite simple.

The final bit that must be implemented is in policy. Here we must be sure that the policy we write reflects this call chain explicitly. This means removing the ability for the toolstack (xend_t) to invoke QEMU (qemu_t) directly and replacing it with allowing the toolstack to execute the svirt-interpose program (svirt_t) while allowing the svirt-interpose domain to transition to the QEMU domain. This is an important part of the process as it prevents the toolstack from bypassing the svirt code. Many will find protections like this superfluous as they imply protection from a malicious toolstack, and the toolstack is a central component of the system’s TCB. There is a grain of truth in this argument though it represents a rather misguided analysis. It is very important to limit the permissions granted to a process to limit the impact of a possible vulnerability, even if the process we’re confining is largely a “trusted” system component.

Category Selection

The central piece of this architecture is to select a unique MCS category for each QEMU process and assign this category to the resources belonging to said process. Before a category can be assigned to a resource we must first choose the category. The only requirement we have when selecting categories is that they are unique (not used by another QEMU process). There is no special meaning in a category number, so it makes sense to select the category number at random.

We’re presented with an interesting challenge here based on the nature of the svirt-interpose program. If this code was integrated with the toolstack directly it would be reasonable to maintain a data structure mapping the running virtual machines to their assigned categories. We could then select a random category number for a new QEMU instance and quickly walk this in-memory structure to be sure this category number hasn’t already been assigned to another process. But as was described previously, the svirt-interpose code is a short lived utility that is invoked by the toolstack and dies shortly after it starts up a QEMU process. Thus we need persistent storage to maintain this association.

The use of the XenStore is reasonable for such data and we use the key ‘selinux-mcs’ under the /local/domain/$domid node (where $domid is the domain id of a running VM) to store the value. Thus we randomly select a category and then walk the XenStore tree examining this key for each running VM. If a conflict is detected a new value is selected and the search continues. This is a very naive algorithm and we discuss ways in which it can be improved in the section on future work.
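
A shell sketch of that check is below; the real svirt-interpose code does this in C through the XenStore library, and the xenstore-* command line tools just make the idea easy to see:

# bash sketch: pick a random MCS category not already recorded for a running domain
while : ; do
    cat=$(( (RANDOM % 1023) + 1 ))
    in_use=0
    for domid in $(xenstore-list /local/domain); do
        used=$(xenstore-read /local/domain/${domid}/selinux-mcs 2>/dev/null)
        [ "${used}" = "${cat}" ] && in_use=1
    done
    [ ${in_use} -eq 0 ] && break
done
echo "selected category c${cat}"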

Object labeling

Once we’ve successfully interposed our svirt code between the toolstack and QEMU and implemented our category selection algorithm we have two tasks remaining. First we must enumerate the objects that belong to this QEMU instance and label them appropriately. Second we must perform the steps necessary to ensure the QEMU process will be labeled properly before we fork and exec it.

Determining the devices assigned to a VM by exploring the XenStore structures is tedious. The information we begin with is the domid of the VM we’re configuring QEMU for. From this we can examine the virtual block devices (VBDs) that belong to this VM but the structure in the VM specific configuration space rooted at /local/domain/$domid only contains information about the backend hosting the device. To find the OS objects associated with the device we need to determine the backend, then examine the configuration space for that backend.

We begin by listing the VBDs assigned to a VM by enumerating the /local/domain/$domid/device/vbd XenStore directory. This will yield a number of paths of the form /local/domain/$domid/device/vbd/$vbd_num where $vbd_num is the numeric id assigned to a virtual block device. VMs can be assigned any number of VBDs so we must process all VBDs listed in this directory.

From these paths representing each VBD assigned to a VM we get closer to the backing store by extracting the path to the backend of the split xen block driver. This path is contained in the key /local/domain/$domid/device/vbd/$vbd_num/backend. Once this path is extracted we check to see if the device in dom0 is writable by reading the ‘mode’ value. If the mode is ‘w’ the device is writable and we must apply the proper MCS label to it. We ignore read only VBDs as XenClient only assigns CDROMs as read only, all hard disks are allocated as read/write.

Once we’ve determined the device is writable we now need to extract the dom0 object (usually a block or loopback device file) that’s backing the device. The location of the device path in XenStore depends on the backend storage type in use. XenClient uses blktap processes to expose VHDs through device nodes in /dev and loopback devices to expose files that contain raw file systems. If a loopback device is in use, the path to the device node will be stored in the XenStore key ‘loop-device’ in the corresponding VBD backend directory. Similarly, if a bit more cryptically, the device node for a blktap device backing a VHD will be in the XenStore key ‘params’.
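
Pulling those steps together, the walk through XenStore looks roughly like this sketch (again, the real code is C; the xenstore-* tools are just for illustration):

# list the writable backing devices for a VM's virtual block devices
domid=$1
for vbd in $(xenstore-list /local/domain/${domid}/device/vbd); do
    backend=$(xenstore-read /local/domain/${domid}/device/vbd/${vbd}/backend)
    mode=$(xenstore-read ${backend}/mode)
    [ "${mode}" = "w" ] || continue                    # skip read-only CD-ROMs
    dev=$(xenstore-read ${backend}/loop-device 2>/dev/null)
    [ -n "${dev}" ] || dev=$(xenstore-read ${backend}/params)
    echo "${dev}"
done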

Once these paths have been extracted the devices can be labeled using the SELinux API. To do so, we first need to know what the label should be. Through the SELinux API we can determine the current context for the file. We then set the MCS category calculated for the VM on this context and then change the file context to the resultant label. Important to note here is that both a sensitivity level and a category must be set on the security context. The SELinux API doesn’t shield us from the internals of the policy here and even though the MCS policy doesn’t reason about sensitivities there is a single sensitivity defined that must be assigned to every object (s0).
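
The labeling step itself is done with the libselinux calls, but the effect is the same as running chcon on the backing device with the chosen category appended to the level; the device path below is just an example:

# give the backing device the per-VM category (sensitivity s0 plus category c<N>)
chcon -l s0:c${cat} /dev/xen/blktap-2/tapdev0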

Assigning a category to the QEMU process is a bit different. Unlike file system objects there isn’t an object that we can query for a label. Instead we can ask the security server to calculate the resultant label of a transition from the current process (sVirt) to the destination process (QEMU). There is an alternative method available however, and this allows us to determine the type for the QEMU process directly. SELinux has added some native support for virtualization and one such bit was the addition of the API call ‘selinux_virtual_domain_context_path’. This function returns the path of a file in the SELinux configuration directory that contains the type to be assigned to domains used for virtualization.

Once we have this type the category calculated earlier is then applied and the full context is obtained. SELinux has a specific API call that allows the caller to request the security server apply a specific context to the process produced by the next exec performed by the calling process (setexeccon). Once this has been done successfully the sVirt process cleans up the environment (closes file descriptors etc) and execs the QEMU program passing it the unmodified command line that was provided by the toolstack.
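
A rough shell analogue of that final step is below; the real code calls setexeccon(3) and then execs QEMU, and both the domain type and the QEMU path here are examples rather than XenClient’s actual values:

# exec QEMU with the computed type plus the per-VM category
runcon -t qemu_t -l "s0:c${cat}" /usr/lib/xen/bin/qemu-dm "$@"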

Conclusion

Applying an MCS category to a QEMU process and its resources is a fairly straightforward task. There are a few details that must be attended to to ensure that proper error handling is in place, but the code is relatively short (~600 LOC) and quite easy to audit. There are some places where the QEMU processes must overlap however. XenClient is all about multiplexing shared hardware between multiple virtual machines on the same PC / laptop. Sharing devices like the CD-ROM that is proxied to clients via QEMU requires some compromise.

As we state above, the CD-ROM is read-only so an MCS category is not applied to the device itself, but XenClient must ensure that accesses to the device are exclusive. This is achieved by QEMU obtaining an exclusive lock on a file in /var/lock before claiming the CD-ROM. All QEMU processes must be able to take this lock so the file must be created without any categories. This may seem like a minor detail but it’s quite tedious to implement in practice and it does represent a path for data to be transmitted from one QEMU process to another. Transmission through this lock file would require collusion between QEMU processes so it’s considered a minimal threat.

Future Work

This is my last post in this series, which has spanned nearly a year. I’m a bit ashamed it’s taken me this long to write up my master’s project but it did end up taking on a life of its own, getting me a job with Citrix on the XenClient team. There’s still a lot of work to be done and I’m hoping to continue documenting it here. First I have to collect the 8 blog posts that document this work and roll them up into a single document I can submit to my adviser to satisfy my degree requirements.

In parallel I’ll be working on all things XenClient, hopefully learning Haskell and integrating the sVirt architecture directly into our toolstack. Having this code in the toolstack directly will have a number of benefits. Most obviously it’ll remove a few forks so VM loading will be quicker. More interestingly, it will open up the possibility of applying MCS category labeling to devices (both PCI and USB) that are assigned to VMs. The end goal, as always, is strengthening the separation between the system components that need to remain separate, thus improving the security of the system.

Troubles with Ovi Store after upgrading N8 to Anna

I took some time yesterday to upgrade my beloved Nokia N8 to the new(ish) Symbian^3 Anna build and it wasn’t a smooth upgrade. Upgrading through the Ovi Suite went well enough but applications like the Ovi Store just wouldn’t work after the upgrade. The Ovi Store application would install and start but would sit on the splash screen showing the “Loading …” message. At this point I did a hard reset, which did nothing.

After combing through a handful of forum posts I ran across this one in the Nokia Europe discussion boards. Initially I had no idea what this guy was talking about. I guess I’m used to my phone being an appliance so downgrading packages seemed pretty crazy, especially since I had no idea where to obtain the packages (sis files) described in the post. A few web searches later I ran across the “Fix Symbian” site. Pretty encouraging name all things considered. Anyway, all I did was download the sis files from the post titled “S^3 QT 4.7.3 MOBILITY 1.1.3”, which downgrades a number of components in the “Notifications Support Package Symbian3 v1.1.11120” package. One quick reboot and the Ovi Store was back up and running.

So I guess the Anna upgrade that Nokia is shipping is broken? Pretty strange really but not something that can’t be worked around thanks to some contributions from the interwebs. Unfortunately the upgrade to Anna didn’t fix the problems I’ve had with my N8 and my wireless access point. I’ll debug this sooner or later, likely when I upgrade my home network with the ALIX boards I got in the mail last week.