Using OE to build an XT ‘Service VM’

UPDATE: I’ve deleted the build scripts git repo mentioned in this post and rolled all of my OE project build scripts into one repo. Find it here: git://github.com/flihp/oe-build-scripts.git

UPDATE #2: I’ve written a more up-to-date post on similar work here: http://twobit.us/blog/2013/11/openembedded-xen-network-driver-vm/. The data in this post should be considered out of date.

Over the past few weeks I’ve run into a few misconceptions about XenClient XT and OpenEmbedded. First is that XT is some sort of magical system that mere mortals can’t customize. Second is that building a special-purpose, super small Linux image on OpenEmbedded is an insurmountable task. This post is an attempt to dispel both of these misconceptions and maybe even motivate some fun work in the process.

Don’t get me wrong though, this isn’t a trivial task and I didn’t start and end this work in one night. There’s still a bunch of work to do here; I’ll lay that out at the end. For now, I’ve put up the build scripts I threw together last night on GitHub. They’re super minimal and derived from another project. Get them here: https://github.com/flihp/transbridge-build-scripts

My goal here is to build a simple rootfs that XT can boot with a VM ‘type’ of ‘servicevm’. This is the type that the XT toolstack associates with the default ‘Network’ VM. Basically it will be a VM invisible to the user. Eventually I’d like for this example to be useful as a transparent network bridge suitable as an in-line filter or even as a ‘driver domain’. But let’s not get ahead of ourselves …

What image do I build?

The first thing you need to choose when building an image with OE is what MACHINE you’re building for. XT uses Xen for virtualization so whatever platform you’re running it on will dictate the MACHINE. Since XT only runs on Intel hardware it’s pretty safe to assume your system is compatible with generic i586. The basic qemux86 MACHINE that’s in the oe-core layer builds for this so for these purposes it’ll suffice. This is already set up in the local.conf in my build scripts.
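For reference, the machine selection is the important line in that local.conf; it’s just:

MACHINE ?= "qemux86"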

To build the minimal core image that’s in the oe-core layer just run my build.sh script from the root of the repository. I like to tee the output to a log file for inspection in the event of a failure:

./build.sh | tee build.log

Now you should have a bunch of new stuff in ./tmp-eglibc/deploy/images/ which includes an ext3 rootfs. The file name should be something like core-image-minimal.ext3. Copy this over to your XT dom0 and get ready to build a VM.
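Something like this will do it, assuming you’ve got ssh access to your dom0 (the host name here is made up); I rename the image to match the VHD name used below:

scp tmp-eglibc/deploy/images/core-image-minimal.ext3 root@xt-dom0:/storage/disks/transbridge.ext3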

Make a VHD

The next thing to do is copy the ext3 image over to a VHD. From within /storage/disks create a new VHD large enough to hold the image. I’ve experimented with both core-image-basic and core-image-minimal and a 100M VHD will be large enough … yes that’s a very small rootfs. core-image-minimal is around 9M:

cd /storage/disks
vhd-util create -n transbridge.vhd -s 100

Next have tap-ctl create a new device node for the VHD:

tap-ctl create -a vhd:/storage/disks/transbridge.vhd

This will output the path to the device node created (and yeah the weird command syntax bugs me too). You can alternatively list the current blktap devices and find yours there:

tap-ctl list
1276    0    0        vhd /storage/ndvm/ndvm.vhd
1281    1    0        vhd /storage/ndvm/ndvm-swap.vhd
...

I’ve got no idea what the first number is (probably the PID of the blktap instance backing the device) but the second number is the minor number for the tap device. The last column is the VHD file backing the device, so find your VHD there, note the minor number, then find the matching device node in /dev/xen/blktap-2/tapdevX … mine had a minor number of ‘8’ so that’s what I’ll use in this example. Then just byte-copy your ext3 onto this device:

dd if=/storage/disks/transbridge.ext3 of=/dev/xen/blktap-2/tapdev8

Then you can mount your VHD in dom0 to poke around:

mount /dev/xen/blktap-2/tapdev8 /media/hdd

Where’s my kernel?

Yeah so OE doesn’t put a kernel on the rootfs for qemu machines. That’s part of why the core-image-minimal image is so damn small. QEMU doesn’t actually boot like regular hardware; you pass it the kernel on the command line, so OE’s doing the right thing here. If you want the kernel from the OE build it’ll be in ./tmp-eglibc/deploy/images/ with the images … but it won’t boot on XT 😦

This is a kernel configuration thing. I could have spent a few days creating a new meta layer and customizing the Yocto kernel to get a ‘Xen-ified’ image but that sounds like a lot of work. I’m happy for this to be quick and dirty for the time being so I just stole the kernel image from the XT ‘Network’ VM to see if I could get my VM booting.

You can do this too by first mounting the Network VM’s rootfs. The cool thing is you don’t need to power down the Network VM to mount its FS in dom0! The disk is exposed to the Network VM as a read-only device so you can mount it read-only in dom0:

mount -o ro /dev/xen/blktap-2/tapdev0 /media/cf

Then just copy the kernel and modules over to your new rootfs and set up some symlinks to the kernel image so it’s easy to find:

cp /media/cf/boot/vmlinuz-2.6.32.12-0.7.1 /media/hdd/boot
cp -R /media/cf/lib/modules/2.6.32.12-0.7.1 /media/hdd/lib/modules
cd /media/hdd/boot
ln -s vmlinuz-2.6.32.12-0.7.1 vmlinuz
cd /media/hdd
ln -s ./boot/vmlinuz-2.6.32.12-0.7.1 vmlinuz

You may find that there isn’t enough space on the ext3 image you copied on to the VHD. Remember that the ext3 image is only as large as the disk image created by OE. Its size won’t be the same as the VHD you created unless you resize it to fill the full VHD. You can do so by first umount’ing the tapdev then running resize2fs on the tapdev:

umount /media/hdd
resize2fs /dev/xen/blktap-2/tapdev8

This will make the file system on the VHD expand to fill the full virtual disk. If you made your VHD large enough you’ll have enough space for the kernel and modules. Like I say above, 100M is a safe number but you can go smaller.

Finally you’ll want to be able to log into your VM. If you picked the minimal image it won’t have ssh or anything so you’ll need a getty listening on the Xen console device. Add the following line to your inittab:

echo -e "\nX:2345:respawn:/sbin/getty 115200 xvc0" >> /media/hdd/etc/inittab

There’s also gonna be a default getty trying to attach to ttyS0, which isn’t present. When the VM is up this will cause some messages on the console:

respawning too fast: disabled for 5 minutes

You can disable this by removing the ‘S’ entry in inittab but really the proper solution is a new image with a proper inittab for an XT service VM … I’ll get there eventually.
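If you want to quiet it down in the meantime, something like this works (assuming the default OE inittab, where the offending entry’s id is ‘S’):

sed -i '/^S:/d' /media/hdd/etc/inittab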

Make it a VM

Up till now all we’ve got is a VHD file. Without a VM to run it nothing interesting is gonna happen though, so now we make one. The XT toolstack isn’t documented to the point where someone can just read the man page, but it will tell you a lot about itself if you just run it without any parameters. Honestly I know very little about our toolstack so I’m always executing xec and grepping through the output.

After some experimentation here are the commands to create a new Linux VM from the provided template and modify it to be a para-virtualized service VM. In retrospect it may be better to use the ‘ndvm’ template but this is how I did it for better or for worse:

xec create-vm-with-template new-vm-linux

This command will output a path to the VM node in the XT configuration database. The name of the VM will also be something crazy. Get the name from the output of xec-vm and change it to something sensible like ‘minimal’:

xec-vm --name <generated-name> set name minimal

Your VM will also get a virtual CD-ROM which we don’t want, so delete it and then add a disk for the VHD we configured:

xec-vm --name minimal --disk 0 delete
xec-vm --name minimal add-disk
xec-vm --name minimal --disk 0 set phys-path /storage/disks/transbridge.vhd

Then set all of the VM properties per the instructions provided in the XT Developer Guide:

xec-vm --name minimal --disk 0 set virt-path xvda
xec-vm --name minimal set flask-label "system_u:system_r:nilfvm_t"
xec-vm --name minimal set stubdom false
xec-vm --name minimal set hvm false
xec-vm --name minimal set qemu-dm-path ""
xec-vm --name minimal set slot -1
xec-vm --name minimal set type servicevm
xec-vm --name minimal set kernel /tmp/minimal-vmlinuz
xec-vm --name minimal set kernel-extract /vmlinuz
xec-vm --name minimal set cmd-line "root=/dev/xvda xencons=xvc0 console=xvc0 rw"

Then all that’s left is booting your new minimal VM:

xec-vm --name minimal start

You can then connect to the dom0 end of the Xen serial device to log into your VM:

screen $(xenstore-read /local/domain/$(xec-vm --name minimal get domid)/console/tty)

Next steps

This is a pretty rough set of instructions but it will produce a bootable VM on XenClient XT from a very small OpenEmbedded core-image-minimal. There’s tons of places this can be cleaned up, starting with a kernel that’s specific to a Xen domU. A real OE DISTRO would be another welcome addition so various distro-specific features could be added and removed more easily. If the lazywebs feel like contributing some OE skills to this effort leave me a comment.

What’s in a hash?

After the initial work on meta-measured it was very clear that configuring an MLE is great but alone it has little value. Sure tboot will measure things for you, it will even store these measurements in your TPM’s PCRs! But the “so what?” remains unanswered: there are hashes in your TPM, who cares?

Even after you’ve set up meta-measured, launched an MLE and dumped out the contents of /sys/class/misc/tpm0/device/pcrs, what have you accomplished? The whole point of meta-measured was to set up the machinery to make this easier and for the PCR values to remain unchanged across a reboot. I was surprised at how much work went into just this. But after this work, the hashes in these PCRs still had no meaning beyond being mysterious, albeit static, hashes.

I closed the meta-measured post stating my next goal was to take a stab at pre-computing some PCR values. Knowing the values that PCRs will have in your final running system allows for secrets to be protected by sealed storage at install time (which I’ve heard called ‘local attestation’ just to confuse things). Naturally the more system state involved in the sealing operation (assume this means ‘more PCRs’ for now) the better. So I had hoped to come back after a bit with the tools necessary for meta-measured to produce a manifest of as many of the tboot PCR values as possible.
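To make the pay-off concrete: with predictable PCR values you can seal a secret at install time and the TPM will only unseal it if the platform boots back into the same state. With tpm-tools that looks roughly like this (the file names are made up; -z uses the well-known SRK secret):

tpm_sealdata -z -p 17 -p 18 -i disk.key -o disk.key.sealed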

Starting with PCR[17]

Naturally I started with what I knew would be the hardest PCR to calculate: the infamous PCR[17]. JP’s comment on my last post pointed out some of his heroic efforts to compute PCR[17] so that was a huge help. So first things first: respect to JP for the pointer. This task would have taken me twice as long were it not for his work and the work of others on tboot-devel.

So I set out to calculate PCR[17], but I think my approach was different from those I was able to find in the public domain. The criteria I came up with for my work were:

  1. Calculate PCR[17] for system A on system B.
  2. Do the measurements myself.

So ‘rule #1’ basically says: no reliance on having a console on the running system. This is one part technical purity, one part good design as the intent is to make these tools as flexible as possible and useful in a build system. ‘Rule #2’ is all technical purity. This isn’t an exercise in recreating the algorithm that produces the value that ends up in PCR[17].

This last bit is important. The whole point is to account for the actual things (software, configuration etc) that are measured as part of bringing up a TXT MLE. Once these are identified they need to be collected (maybe even extracted from the system) if possible, and then used to calculate the final hash stored in PCR[17]. So basically, no parsing and hashing the output from ‘txt-stat’, that’s cheating 🙂 I explained this approach to a friend and was instantly accused of masochism. That’s a good sign and I guess there’s an element of that in the approach as well, if not everything I do.
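For reference, the operation at the bottom of all this is simple even if the inputs aren’t: extending a PCR replaces its contents with the hash of its old contents concatenated with the new measurement. Here’s one extend step sketched in shell (measured-item.bin is a stand-in for whatever is measured at a given step):

# after a successful measured launch PCR[17] starts as 20 bytes of zeros
pcr=0000000000000000000000000000000000000000
# hash the object being measured
meas=$(sha1sum measured-item.bin | cut -d' ' -f1)
# extend: PCR_new = SHA1(PCR_old || measurement)
pcr=$(printf '%s%s' "${pcr}" "${meas}" | xxd -r -p | sha1sum | cut -d' ' -f1)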

Wrapping up an exploratory exercise in learning (or brushing up on) a language is always a good idea, right? So I did as much of my work on this as possible in Python. Naturally I had to break this rule and use some C at the end but that’s a bit of a punchline so I don’t want to spoil the joke.

So if you’re only interested in the code I won’t bore you with any more talk about ‘goals’ and ‘design’. It’s all up on github. The python’s here: https://github.com/flihp/pcr-calc. The C is here: https://github.com/flihp/pcr-calc_c. There isn’t much in the way of documentation but I’ll get into that soon.

If you are interested in the words that accompany this work stay tuned. My next post will give a bit of a tour of the rabbit hole that is calculating PCR[17]. This will include discussion of each ‘thing’ that’s measured and what it all means. Like I said though: the end result is that precalculating PCR[17] for arbitrary platforms is a massive PITA and likely not very useful for my original purposes. After thinking on it a bit however I’m quite certain this info may be useful elsewhere but I’ll save that for discussion on follow-on work.

Measured Launch on OE core

It’s been 4 months since my last post but I’ve been working on some fun stuff. Said work has progressed to the point where it’s actually worth talking about publicly so I’m crawling out from under my favorite rock and putting it “out there”.

My last few bits of writing were about some random OpenEmbedded stuff, basically outlining things I was learning while bumbling my way through the OE basics. I’ve been reading through the meta-selinux and meta-virtualization layers and they’re a great place to learn. Over the winter Holiday here I had some extra vacation time from my day job to burn so I finally got serious about a project I’ve been meaning to start for way too long.

meta-measured

Over the past year I’ve been thinking a lot about the “right way” to measure a software system. We’ve implemented a measurement architecture on XT but this has a few downsides. First, a system as large as XT is very difficult to use as a teaching tool. It’s hard to explain and show someone the benefits of measuring a system when your example is large, complex and the relevant bits are spread throughout the whole system. Even our engineers who know our build system inside and out often get lost in the details. Second, the code belongs to Citrix and closed source software isn’t very useful to anyone except the people selling it.

So after reading through the meta-selinux and meta-xen layers a bunch and learning a good bit about writing recipes I’ve started work on a reference image for a “measured system”. I’m keeping the recipes that make up this work in a layer I call ‘meta-measured’. For this first post on the topic of measured systems I’ll stick to discussing the basic mechanics of its construction. This includes some data on the supporting recipes and some of the component parts necessary for booting it. Hopefully along the way I’ll be able to justify the work by discussing the potential benefits to system security, but the theory and architecture discussions will be left for a later post.

get the source

If you’re interested in just building it and playing with the live image this is where you should start. Take a look and let me know what you think. Feedback would be much appreciated.

All of the work I’ve done to get this first bootable image working is up on my github. You can get there from here: https://github.com/flihp. The ‘meta-measured’ layer is here: https://github.com/flihp/meta-measured.git. To automate setting up a build environment for this I’ve got another repo with a few scripts to checkout the necessary supporting software (bitbake / OE / meta-intel etc), a local.conf (which you may need to modify for your environment), and a script to build the ‘iso’ that can be written to a USB drive for booting a test system: https://github.com/flihp/measured-build-scripts.

The best way to build this currently is to checkout the measured-build-scripts repo:

git clone git://github.com/flihp/measured-build-scripts.git

run the ‘fetch.sh’ script to populate the required git submodules and to clone the meta-measured layer:

cd measured-build-scripts
./fetch.sh

build the iso

If you try to run the ./build.sh script next, as you would think you should, the build will currently fail. It will do so while attempting to download the SINIT / ACM module for TXT / tboot, because Intel hides the ACMs behind a legal wall with terms that must be accepted before the files can be downloaded. I’ve put the direct link to it in the recipe but the download fails unless you’ve got the right cookie in your browser, so wget blows up. Download it yourself from here: http://software.intel.com/en-us/articles/intel-trusted-execution-technology, then drop the zip into your ‘download’ directory manually. I’ve got the local.conf with DL_DIR hardwired to /mnt/openembedded/downloads so you’ll likely want to change this to suit your environment.

Anyway I’ll sort out a way to fool the Intel lawyer wall eventually … I’m tempted to mirror these files since the legal notice seems to allow this but I don’t really have the bandwidth ATM. Once you’ve got this sorted, run the build.sh script. I typically tee the output to a file for debugging … this is some very ‘pre-alpha’ stuff so you should expect to debug the build a bit 🙂

./build.sh | tee build.log

This will build a few images from the measured-image-bootimg recipe (tarballs, cpios, and an iso). The local.conf I’ve got in my build directory is specific to my test hardware so if you’ve got an Intel SugarBay system to test on then you can dump the ISO directly to a USB stick and boot it. If you don’t have a SugarBay system then you’ll have to do some work to get it booting since this measured boot stuff is closely tied to the hardware, though the ACMs I’ve packaged work for 2nd and 3rd gen i5 and i7 hardware (Sandy and Ivy Bridge).
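Writing it out is the usual dd invocation; the image file name here is a guess (check tmp-eglibc/deploy/images/ for the real one) and /dev/sdX is your USB stick:

dd if=tmp-eglibc/deploy/images/measured-image-bootimg.iso of=/dev/sdX bs=4M
sync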

recipes

I’ve organized the recipes that make up this work into two categories: those that are specific to the TPM and those that are specific to TXT / tboot. Each of these two technologies requires some kernel configs so those are separated out into fragments like I’ve found in other layers. My test hardware has USB 3.0 ports which the base OE layers don’t seem to support yet. I’ve included this config in my oe-measured distro just so I can use the ports on the front of my test system.

The TPM recipes automate building the Trousers daemon, libtspi and some user space tools that consume the TSS interface. Recipes for the TPM software are pretty straightforward as most are autotools projects. Some work was required to get the trousers project separated into packages for the daemon and library.

The tboot recipes were a bit more work because tboot packages a bunch of utilities in the main tboot source tree, so they had to be separated out into different packages (this work is still on-going). Further, tboot doesn’t use autotools and it squashes most compiler flags that the OE environment passes in. The compiler flags required by tboot are static, which stands at odds with OE and a cross-compiled environment that wants to change the path to everything including the compiler.

I’ve no clue if tboot will build properly on anything other than an Intel system. Further, the issue of Intel hiding the ACMs required for their chipsets behind an EULA wall is annoying as the default OE fetcher won’t work.

images

My first instinct is always to describe a system by construction: from the bottom up. In this case I think going top-down is a better approach so we’ll start with the rootfs and work backwards. The TPM recipes include two images based on the core-image from OE core: one initramfs image and one rootfs. The rootfs is just the core-image with the TPM kernel drivers, trousers daemon, tpm-tools and the tpm-quote-tools. I haven’t done much with this rootfs other than booting it up and seeing if TXT and the TPM work as expected.

There’s also an initramfs with the TPM kernel drivers, trousers daemon and the tpm-tools but not the quote tools. This is a very minimal initramfs with the TSS daemon loaded manually in the initrd script. It’s not expected that users will be using the tpm-tools interactively here but that’s what I’ve been doing for initial testing. Only the tpm_extendpcr tool (open source from Citrix) is used to extend a PCR with the sha1sum hash of the rootfs before the call to switch_root. This requires that the ‘coreutils’ package be included just for the one utility, which bloats the initramfs unfortunately. Slimming this down shouldn’t be too much work in the future. Anyway I think this is ‘the right way’ to extend the measurement chain from the initramfs up to the rootfs of the system.

The rest of the measurements we care about are taken care of by the components from the TXT recipes. There’s only one image in the TXT recipe group however. This is derived from the OE core live image and it’s intended to be ‘deployable’ in the language of OE recipes. I think this means an hddimg or an ISO image, basically something you can ‘dd’ to disk and boot. Currently it’s the basis for a live image but could easily be used for something like an installer simply by switching out the rootfs.

This image is not a separate root filesystem but instead an image created with the files necessary to boot the system: syslinux (configured with the mboot.c32 comboot module), tboot, the ACMs, and the initrd and rootfs from the TPM recipes. tboot measures the bootloader config, all of the boot modules and a bunch of other stuff (see the README in the tboot sources for details). It stores these measurements in the TPM for us, creating the ‘dynamic root of trust for measurement’ (DRTM).

Once tboot has measured all of the modules, the initramfs takes over. The initramfs then measures the rootfs as described above before the switch to root. I’ve added a few kernel parameters to pass the name of the rootfs and the PCR where its measurement is to be stored.

If the rootfs is measured on each boot it must be mounted read-only to prevent its measurement from changing … yup, even mounting a journaled file system read-write will modify the journal and change the filesystem. Creating a read-only image is a bit of work so for this first prototype I’ve used a bit of a shortcut: I mount the rootfs read-only, create a ramfs read-write, then combine the two in a unionfs. In this configuration the rootfs looks like a read / write mount when it boots, but on each boot the measurements in the TPM are the same.
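The mount dance amounts to something like the following sketch (not the actual initramfs script; the device, mount points and unionfs option syntax are assumptions and vary between unionfs implementations):

mount -o ro /dev/xvda /ro                              # measured rootfs, read-only
mount -t ramfs ramfs /rw                               # throw-away read-write layer
mount -t unionfs -o dirs=/rw=rw:/ro=ro unionfs /union  # union presented as the root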

Next Steps

Measuring a system is all well and good but who cares? Measurements are only useful when they’re communicated to external parties. For now this image only takes measurements and these measurements are the same on each boot. That’s it. Where this can be most immediately useful is that these measurements can be predicted in the build.

PCRs 0-7 are reserved for the BIOS and we have no way of predicting these values currently as they’re unique to the platform, and that’s messy. The tboot PCRs however (17, 18 and 19 in the Legacy mapping currently used) can be calculated based on the hashing done by tboot (read their docs and http://www.mail-archive.com/tboot-devel@lists.sourceforge.net/msg00069.html). The PCR value containing the measurement of the rootfs can be calculated quite simply as well.

For a reference live image this is interesting only in an academic capacity. As I suggest above, this image can be used as a template for something like an installer which would give the predictability of PCR values much deeper meaning: Consider an installer architecture where the installer itself is a very small rootfs that downloads the install package from a remote server (basically Debian’s netboot iso or a PXE boot setup). Assuming we have a method for exchanging system measurements (more future work) it would be very useful for the remote server to be able to evaluate measurements from the installer before releasing the install package.

This is probably a good place to wrap up this post. The meta-measured layer I’ve described is still very new and the images I’ve built are still useful only for ‘tire-kicking’. My next post will hopefully discuss predicting measurement values in the build system and other fun stuffs.

Openembedded Yocto Native Hello World: Take 2

A while back I wrote about some problems I was having with native OpenEmbedded recipes that were building packages with raw Makefiles (no autotools). I wrote up the problem and the work around I was using here. I got some feedback pointing out what I was doing wrong but I guess my brain just didn’t process it. I was using uppercase variables while the GNU Make docs specifically call for lowercase! These variables get magically passed to Make likely through the environment and then everything just works.

So here’s my retraction: I was wrong 🙂 The ‘hello world’ Makefile should look like this:

.PHONY : clean uninstall

prefix ?= /usr
exec_prefix ?= $(prefix)
bindir ?= $(exec_prefix)/bin

HELLO_src = hello.c
HELLO_bin = hello
HELLO_tgt = $(DESTDIR)$(bindir)/$(HELLO_bin)

all : $(HELLO_bin)

$(HELLO_bin) : $(HELLO_src)

$(HELLO_tgt) : $(HELLO_bin)
	install -d $(DESTDIR)$(bindir)
	install -m 0755 $^ $@

clean :
	rm $(HELLO_bin)

install : $(HELLO_tgt)

uninstall :
	rm $(HELLO_tgt)

You can download the recipe here: hello_1.1.bb.

TXT Capable Desktop Virtualization System

Having worked on XenClient XT for the past year I’ve experienced the pain of debugging vendors’ TXT implementations first hand. TXT may be a nearly 6 year old technology but it’s just now coming into use and many vendors’ platforms have only received internal testing. We’ve found a number of ways for platforms to fail in strange ways and we’ve had to work with the vendors to get their implementations working for a system like XT that uses tboot as part of our measured launch.

For development Citrix has provided me with a number of systems but I’ve been meaning to put one together for myself for some time now. I’ve always liked building my own so I wasn’t thrilled with the prospect of purchasing a Dell / HP system. Home builds are always a bit cooler, a bit cheaper, and more fun in general. That said I was a bit worried about being able to find a motherboard / CPU combo with full AND WORKING VT-x, VT-d and TXT. It wasn’t as bad as I expected. So the following is a breakdown of the home build system I put together specifically to run XT.

Case

I always start building systems with the case. This will dictate size which in turn limits your choices for motherboards. I’ve had a string of successes building systems in Lian Li cases so again they were my first choice. I wanted this system to be as small as possible. Lian Li happens to make probably the best mini-ITX case on the market: the PC-Q02A. This case is tiny and it comes bundled with a 300W power supply. No room in the back of the case for PCI cards either so if you buy this don’t expect to throw a graphics card in it. Whatever you need has to be on the motherboard!

CPU

Since I intend to run XT on this system the CPU has to support the full Intel vPro suite including TXT. This limited me to high-end Intel i5 and i7 processors. Since this system will be in a small, low-power case I wanted a 65W CPU and went with the Intel i7-2600S. CPUs aren’t really where you want to save money on a build so I didn’t skimp here.

Motherboard

The motherboard is really where vPro and TXT are either made or broken. The BIOS is where CPU features are either enabled or disabled and many motherboard vendors don’t list anything in their docs about TXT compatibility. This is mostly because home users typically don’t care. In this case we do, so some research is required. I played it safe and went with an Intel DQ67EP. TXT and the TPM worked flawlessly. One deviation from the Dell and HP platforms I’ve played with: the TPM came without an EK populated. It’s a simple case of running tpm_createek on the system, but because all of the vendor platforms come with an EK pre-populated the XT code doesn’t account for this situation. Easy to work around.

RAM

From here it’s just a matter of getting RAM and a hard drive / CD-ROM. Since this system will be running virtualized desktops the more RAM the better. XT doesn’t over-commit RAM so if you want to run two desktops with 4G of RAM each you’ll need a bit more than 8G of RAM since dom0 and the service VMs will need a bit too. I picked up 16G of G.SKILL Ares RAM. Generally I run 3 desktops: one Debian Squeeze for development, one Debian Wheezy for my personal stuffs and one Windows 7 for required email and GoToMeetings, each with 4G of RAM. This system has no problems handling all 3 at the same time.

Disks

With hard disks, faster is always better and bigger doesn’t hurt. This is where I saved money though since I’ve already got a huge (6T!) NFS server for bulk storage. I went with a modest 120G OCZ Agility 3. I haven’t done any benchmarking but it’s big enough for my root filesystems and fast enough as well. I also put in a Sony Optiarc BC-5650H-01 6X Slim Blu-Ray reader. I got it on the cheap from an eBay seller. It’s one of those fancy slimline drives so there’s no tray to eject. I hardly ever use optical media except to rip CDs onto my NFS storage. Not sure how much I’ll use this but it’s nice even though it doesn’t match the case 🙂

Photos

I wish there were something in this last photo to give some perspective on the size of the case. Basically it’s the size of a lunchbox 🙂

XT Configuration

Once it’s all put together with XT installed it pretty much “just works”. The USB stack on XT isn’t perfect so some USB devices won’t work unless you pass the USB controller directly through to a guest using Xen’s PCI passthrough stuff. Since XT doesn’t support USB 3.0 anyways I’ve passed the 3.0 controller on the motherboard through directly to my Windows 7 guest to get my webcam working. The Plantronics headset you see in the photo works fine over the virtualized USB stack though and together they make for a pretty sweet VoIP / Video Conferencing / GoToMeeting setup. So that’s it: a home-built Intel i7 system with 16G of RAM, an SSD and Intel TXT / measured boot running XenClient XT. Pretty solid home system by my standards.

Chrome web sandbox on XenClient

There’s lots of software out there that sets up a “sandbox” to protect your system from untrusted code. The examples that come to mind are Chrome and Adobe’s Flash sandbox. The strength of these sandboxes is an interesting point of discussion. Strength is always related to the mechanism, and if you’re running on Windows the separation guarantees you get are only as strong as the separation Windows affords to processes. If this is a strong enough guarantee for you then you probably won’t find this post very useful. If you’re interested in using XenClient and the Xen hypervisor to get the strongest separation I can think of, then read on!

Use Case

XenClient allows you to run any number of operating systems on a single piece of hardware. In my case this is a laptop. I’ve got two VMs: my work desktop (Windows 7) for email and other work stuff and my development system that runs Debian testing (Wheezy as of now).

Long story short, I don’t trust some of the crap that’s out on the web to run on either of these systems. I’d like to confine my web browsing to a separate VM to protect my company’s data and my development system. This article will show you how to build a bare bones Linux VM that runs a web browser (Chromium) and little more.

Setup

You’ll need a Linux VM to host your web browser. I like Debian Wheezy since the PV Xen drivers for network and disk work out of the box on XenClient (2.1). There’s a small bug that requires you to use LVM for your rootfs but I typically do that anyways so no worries there.

Typically I do an install omitting even the “standard system tools” to keep things as small as possible. This results in a root file system that’s < 1G. All you need to do then is install the web browser (chromium), rungetty, and the xinit package. Next is a bit of scripting and some minor configuration changes.
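On Wheezy that boils down to something like:

apt-get install --no-install-recommends chromium rungetty xinit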

inittab

When this VM boots we want the web browser to launch and run full screen. We don’t want a window manager or anything. Just the browser.

When Linux boots, the init process parses the /etc/inittab file. One of the things specified in inittab are processes that init starts like getty. Typically inittab starts getty‘s on 6 ttys but we want it to start chrome for us. We can do this by having init execute rungetty (read the man page!) which we can then have execute arbitrary commands for us:

# /sbin/getty invocations for the runlevels.
#
# The "id" field MUST be the same as the last
# characters of the device (after "tty").
#
# Format:
#  <id>:<runlevels>:<action>:<process>
#
# Note that on most Debian systems tty7 is used by the X Window System,
# so if you want to add more getty's go ahead but skip tty7 if you run X.
#
1:2345:respawn:/sbin/getty 38400 tty1
2:23:respawn:/sbin/getty 38400 tty2
3:23:respawn:/sbin/getty 38400 tty3
4:23:respawn:/sbin/getty 38400 tty4
5:23:respawn:/sbin/getty 38400 tty5
6:23:respawn:/sbin/rungetty tty6 -u root /usr/sbin/chrome-restore.sh

Another configuration change you’ll have to make is in /etc/X11/Xwrapper.config. The default configuration in this file prevents users from starting X if their controlling TTY isn’t a virtual console. Since we’re kicking off chromium directly we need to relax this restriction:

allowed_users=anybody

chromium-restore script

Notice that we have rungetty execute a script for us and it does so as the root user. We don’t want chromium running as root but we need to do some set-up before we kick off chromium as an unprivileged user. Here’s the chrome-restore.sh script:

#!/bin/sh

USER=chromium
HOMEDIR=/home/${USER}
HOMESAFE=/usr/share/${USER}-clean
CONFIG=${HOMEDIR}/.config/chromium/Default/Preferences
LAUNCH=$(which chromium-launch.sh)
if [ ! -x "${LAUNCH}" ]; then
	echo "chromium-launch.sh not executable: ${LAUNCH}"
	exit 1
fi
CMD="${LAUNCH} ${CONFIG}"

rsync -avh --delete ${HOMESAFE}/ ${HOMEDIR}/ > /dev/null 2>&1
chown -R ${USER}:${USER} ${HOMEDIR}

/bin/su - -- ${USER} -l -c "STARTUP=\"${CMD}\" startx" < /dev/null
shutdown -Ph now

The first part of this script sets up the home directory for the user (chromium) that will be running chromium. This is the equivalent of us restoring the user’s home directory to a “known good state”. This means that the directory located at /usr/share/chromium-clean is a “known good” home directory for us to start from. On my system it’s basically an empty directory with chrome’s default config.

The second part of the script, really just the last two lines, runs startx as an unprivileged user. startx kicks off the X server, but first we set the variable STARTUP to the name of another script: chromium-launch.sh. When this variable is set, startx runs the command from the variable after the X server is started. This is a convenient way to kick off an X server that runs just a single graphical application.

The last command shuts down the VM. The shutdown command will only be run after the X server terminates which will happen once the chromium process terminates. This means that once the last browser tab is closed the VM will shutdown.

chromium-launch script

The chromium-launch.sh script looks like this:

#!/bin/sh

CONFIG=$1
if [ ! -f "${CONFIG}" ]; then
	echo "cannot locate CONFIG: ${CONFIG}"
	exit 1
fi

LINE=$(xrandr -q 2> /dev/null | grep Screen)
WIDTH=$(echo ${LINE} | awk '{ print $8 }')
HEIGHT=$(echo ${LINE} | awk '{ print $10 }' | tr -d ',')

sed -i -e "s&\(\s\+\"bottom\":\s\+\)-\?[0-9]\+&\1${HEIGHT}&" ${CONFIG}
sed -i -e "s&\(\s\+\"left\":\s\+\)-\?[0-9]\+&\10&" ${CONFIG}
sed -i -e "s&\(\s\+\"right\":\s\+\)-\?[0-9]\+&\1${WIDTH}&" ${CONFIG}
sed -i -e "s&\(\s\+\"top\":\s\+\)-\?[0-9]\+&\10&" ${CONFIG}
sed -i -e "s&\(\s\+\"work_area_bottom\":\s\+\)-\?[0-9]\+&\1${HEIGHT}&" ${CONFIG}
sed -i -e "s&\(\s\+\"work_area_left\":\s\+\)-\?[0-9]\+&\10&" ${CONFIG}
sed -i -e "s&\(\s\+\"work_area_right\":\s\+\)-\?[0-9]\+&\1${WIDTH}&" ${CONFIG}
sed -i -e "s&\(\s\+\"work_area_top\":\s\+\)-\?[0-9]\+&\10&" ${CONFIG}

chromium

It’s a pretty simple script. It takes one parameter: the path to the main chromium config file. It queries the X server through xrandr to get the screen dimensions (WIDTH and HEIGHT), which means it must be run after the X server starts. It then re-writes the relevant values in the config file to the maximum screen width and height so the browser is run “full screen”. Pretty simple stuff … once you figure out the proper order to do things and the format of the Preferences file, which was non-trivial.

User Homedir

The other hard part is creating the “known good” home directory for your unprivileged user. What I did was start up chromium once manually. This causes the standard chromium configuration to be generated with default values. I then copied this off to /usr/share to be extracted on each boot.
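In other words, once the profile looks the way you want, snapshot it (the paths match the chrome-restore.sh script above):

# run chromium once as the unprivileged user to generate a default profile,
# then save it off as the "known good" home directory
cp -a /home/chromium /usr/share/chromium-clean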

Conclusion

So hopefully these instructions are enough to get you a Linux system that boots and runs Chromium as an unprivileged user. It should restore that user’s home directory to a known good state on each boot so that any downloaded data will be wiped clean. When the last browser tab is closed it will power off the system.

I use this on my XenClient XT system for browsing sites that I want to keep separate from my other VMs. It’s not perfect though and as always there is more that can be done to secure it. I’d start by making the root file system read only and adding SELinux would be fun. Also the interface is far too minimal. Finding a way to handle edge cases like making pop-ups manageable and allowing us to do things like control volume levels would also be nice. This may require configuring a minimal window manager which is a pretty daunting task. If you have any other interesting ways to make this VM more usable or lock it down better you should leave them in the comments.

openembedded yocto native hello world

NOTE: I took the time to get to the bottom of the issue discussed in this post. There’s a new post here that explains the “right way” to use Makefiles with yocto. As always, the error in this post was mine 🙂

I’ve officially “drank the Kool-Aid” and I’m convinced OpenEmbedded and Yocto are pretty awesome. I’ve had a blast building small Debian systems on PC Engines hardware in the past and while I’m waiting for my Raspberry Pi to arrive I’ve been trying to learn the ins and outs of Yocto. The added bonus is that the XenClient team at Citrix uses OpenEmbedded for our build system so this work can also fall under the heading of “professional development”.

Naturally the first task I took on was way too complicated so I made a bunch of great progress (more about that in a future post once I get it stable) but then I hit a wall that I ended up banging my head against for a full day. I posted a cry for help on the mailing list and didn’t get any responses so I set out to remove as many moving parts as possible and find the root cause.

First things first: read the Yocto development manual and the Yocto reference for whatever release you’re using. This is essential because no one will help you until you’ve read and understood these 🙂

So the software I’m trying to build is built using raw Makefiles, none of that fancy autotools stuff. This can be a bit of a pain because depending on the Makefiles, it’s not uncommon for assumptions to be made about file system paths. Openembedded is all about cross compiling so it wants to build and install software under all sorts of strange roots and some Makefiles just can’t handle this. I ran into a few of these scenarios but nothing I couldn’t overcome.

Getting a package for my target architecture wasn’t bad but I did run into a nasty problem when I tried to get a native package built. From the searches I did on the interwebs it looks like there have been a number of ways to build native packages. The current “right way” is simply to have your recipe extend the native class. Thanks to XorA for documenting his/her new package workflow for that nugget.

BBCLASSEXTEND = "native"

After having this method blow up for my recipe I was tempted to hack together some crazy work around. I really want to upstream the stuff I’m working on though and I figure having crazy shit in my recipe to work around my misunderstanding of the native class was setting the whole thing up for failure. So instead I went back to basics and made a “hello world” program and recipe (included at the end of this post) hoping to recreate the error and hopefully figure out what I was doing wrong at the same time.
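For context, a native-capable hello world recipe is about as small as recipes get. It boils down to something like this (a sketch from memory; the license checksum and file list are assumptions, not the actual recipe):

DESCRIPTION = "hello world example"
LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://${COMMON_LICENSE_DIR}/MIT;md5=0835ade698e0bcf8506ecda2f7b4f302"

SRC_URI = "file://hello.c file://Makefile"
S = "${WORKDIR}"

do_install () {
    oe_runmake DESTDIR=${D} install
}

BBCLASSEXTEND = "native"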

It took a bit of extra work but I was able to recreate the issue with a very simple Makefile. First the error message:

NOTE: package hello-native-1.0-r0: task do_populate_sysroot: Started
ERROR: Error executing a python function in /home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb:
CalledProcessError: Command 'tar -cf - -C /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux -ps . | tar -xf - -C /home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux' returned non-zero exit status 2 with output tar: /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux: Cannot chdir: No such file or directory
tar: Error is not recoverable: exiting now
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors


ERROR: The stack trace of python calls that resulted in this exception/failure was:
ERROR:   File "sstate_task_postfunc", line 10, in <module>
ERROR:
ERROR:   File "sstate_task_postfunc", line 4, in sstate_task_postfunc
ERROR:
ERROR:   File "sstate.bbclass", line 19, in sstate_install
ERROR:
ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 59, in copytree
ERROR:     check_output(cmd, shell=True, stderr=subprocess.STDOUT)
ERROR:
ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 121, in check_output
ERROR:     raise CalledProcessError(retcode, cmd, output=output)
ERROR:
ERROR: The code that was being executed was:
ERROR:      0006:        bb.build.exec_func(intercept, d)
ERROR:      0007:    sstate_package(shared_state, d)
ERROR:      0008:
ERROR:      0009:
ERROR:  *** 0010:sstate_task_postfunc(d)
ERROR:      0011:
ERROR: (file: 'sstate_task_postfunc', lineno: 10, function: <module>)
ERROR:      0001:
ERROR:      0002:def sstate_task_postfunc(d):
ERROR:      0003:    shared_state = sstate_state_fromvars(d)
ERROR:  *** 0004:    sstate_install(shared_state, d)
ERROR:      0005:    for intercept in shared_state['interceptfuncs']:
ERROR:      0006:        bb.build.exec_func(intercept, d)
ERROR:      0007:    sstate_package(shared_state, d)
ERROR:      0008:
ERROR: (file: 'sstate_task_postfunc', lineno: 4, function: sstate_task_postfunc)
ERROR: Function 'sstate_task_postfunc' failed
ERROR: Logfile of failure stored in: /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/temp/log.do_populate_sysroot.30718
Log data follows:
| NOTE: QA checking staging
| ERROR: Error executing a python function in /home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb:
| CalledProcessError: Command 'tar -cf - -C /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux -ps . | tar -xf - -C /home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux' returned non-zero exit status 2 with output tar: /home/build/poky-edison-6.0/build/tmp/work/i686-linux/hello-native-1.0-r0/sysroot-destdir///home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux: Cannot chdir: No such file or directory
| tar: Error is not recoverable: exiting now
| tar: This does not look like a tar archive
| tar: Exiting with failure status due to previous errors
|
|
| ERROR: The stack trace of python calls that resulted in this exception/failure was:
| ERROR:   File "sstate_task_postfunc", line 10, in <module>
| ERROR:
| ERROR:   File "sstate_task_postfunc", line 4, in sstate_task_postfunc
| ERROR:
| ERROR:   File "sstate.bbclass", line 19, in sstate_install
| ERROR:
| ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 59, in copytree
| ERROR:     check_output(cmd, shell=True, stderr=subprocess.STDOUT)
| ERROR:
| ERROR:   File "/home/build/poky-edison-6.0/meta/lib/oe/path.py", line 121, in check_output
| ERROR:     raise CalledProcessError(retcode, cmd, output=output)
| ERROR:
| ERROR: The code that was being executed was:
| ERROR:      0006:        bb.build.exec_func(intercept, d)
| ERROR:      0007:    sstate_package(shared_state, d)
| ERROR:      0008:
| ERROR:      0009:
| ERROR:  *** 0010:sstate_task_postfunc(d)
| ERROR:      0011:
| ERROR: (file: 'sstate_task_postfunc', lineno: 10, function: <module>)
| ERROR:      0001:
| ERROR:      0002:def sstate_task_postfunc(d):
| ERROR:      0003:    shared_state = sstate_state_fromvars(d)
| ERROR:  *** 0004:    sstate_install(shared_state, d)
| ERROR:      0005:    for intercept in shared_state['interceptfuncs']:
| ERROR:      0006:        bb.build.exec_func(intercept, d)
| ERROR:      0007:    sstate_package(shared_state, d)
| ERROR:      0008:
| ERROR: (file: 'sstate_task_postfunc', lineno: 4, function: sstate_task_postfunc)
| ERROR: Function 'sstate_task_postfunc' failed
NOTE: package hello-native-1.0-r0: task do_populate_sysroot: Failed
ERROR: Task 3 (virtual:native:/home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb, do_populate_sysroot) failed with exit code '1'
ERROR: 'virtual:native:/home/build/poky-edison-6.0/meta-test/recipes-test/helloworld/hello_1.0.bb' failed

So even with the most simple Makefile I could cause a native recipe build to blow up. Here’s the Makefile:

.PHONY : all clean install uninstall

PREFIX ?= $(DESTDIR)/usr
BINDIR ?= $(PREFIX)/bin

HELLO_src = hello.c
HELLO_bin = hello
HELLO_tgt = $(BINDIR)/$(HELLO_bin)

all : $(HELLO_bin)

$(HELLO_bin) : $(HELLO_src)

$(HELLO_tgt) : $(HELLO_bin)
	install -d $(BINDIR)
	install -m 0755 $^ $@

clean :
	rm $(HELLO_bin)

install : $(HELLO_tgt)

uninstall :
	rm $(HELLO_tgt)

And here’s the relevant install method from the bitbake recipe:

do_install () {
    oe_runmake DESTDIR=${D} install
}

Notice I’m using the variable DESTDIR to tell the Makefile the root (not just /) to install things to. This should work right? It works for a regular package but not for a native one! This drove me nuts for a full day.

The solution to this problem lies in some weirdness in the Yocto native class when combined with the populate_sysroot method. The way I figured this out was by inspecting the differences in the environment when building hello vs hello-native. When building the regular package for the target architecture variables like bindir and sbindir were what I would expect them to be:

bindir="/usr/bin"
sbindir="/usr/sbin"

but when building hello-native they get a bit crazy:

bindir="/home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux/usr/bin"
sbindir="/home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux/usr/sbin"
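If you want to make this comparison yourself, bitbake will dump the full environment for a recipe:

bitbake -e hello | grep '^bindir='
bitbake -e hello-native | grep '^bindir='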

This is a hint at the source of crazy path that staging is trying to tar up above in the error message. Further if you look in the build directory for a regular target arch package you’ll see your files where you expect in ${D}sysroot-destdir/usr/bin but for a native build you’ll see stuff in ${D}sysroot-destdir/home/build/poky-edison-6.0/build/tmp/sysroots/i686-linux/usr/bin. Pretty crazy right? I’m sure there’s a technical reason for this but it’s beyond me.

So the way you can work around this is by telling your Makefiles about paths like bindir through the recipe. A fixed do_install would look like this:

do_install () {
    oe_runmake DESTDIR=${D} BINDIR=${D}${bindir} install
}

For more complicated Makefiles you can probably specify a PREFIX and set this equal to the ${prefix} variable but YMMV. I’ll be trying this out to keep my recipes as simple as possible.
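Following the same pattern as the BINDIR fix above, that would look something like this (an untested sketch):

do_install () {
    oe_runmake PREFIX=${D}${prefix} install
}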

If you want to download my example the recipe is here. This will pull down the hello world source code and build the whole thing for you.

Linux bridge forward EAPOL 8021x frames

XenClient is no different from other Xen configurations in that the networking hardware is shared between guests through a bridge hosted in dom0 (or a network driver domain in the case of XenClient XT). For most use cases the standard Linux bridge will route your traffic as expected. We ran into an interesting problem however when a customer doing a pilot on XenClient XT tried to authenticate their guest VMs using EAPOL (802.1x auth over Ethernet). The bridge gobbled up their packets and we got some pretty strange bug reports as a result.

Just throwing “linux bridge EAPOL 8021x” into a search engine will return a number of hits from various mailing lists where users report similar issues. The fix is literally a one line change that drops a check on the destination MAC address. This check is there to ensure compliance with the 802.1D standard, which requires layer 2 bridges to drop packets addressed to the “bridge filter MAC address group”. Since XenClient is a commercial product and the fix is in code that’s responsible for our guest networking (which is pretty important) we wanted to code up a way to selectively enable this feature on a per-bridge basis using a sysfs node. We / I also tested the hell out of it for a few days straight.
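Usage ends up being a per-bridge toggle under /sys/class/net/<bridge>/bridge/; the node name below is illustrative only (check the patch for the real one):

echo 1 > /sys/class/net/xenbr0/bridge/eapol_forward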

The end result is a neat little patch that allows users to selectively pass EAPOL packets from their guests across the layer 2 bridge in dom0 / ndvm and out to their authentication infrastructure. The patch is open source just like the kernel and is on the XenClient source CD. It’s also available here for your convenience 🙂

OE-Core Yocto gcc timeout

I’ve been thrashing around trying to get the upstream OE to build an image for me. Today I finally made a concerted effort over a few hours to dive deep and do this right. It turns out I was using the “old” OE repos when I should have been using the “new” build system from the Yocto Project. Their documentation is excellent but still, my first build failed.

What’s this? The GCC recipe failing because of a network timeout? Oddly enough it actually downloaded some of the sources but not all of ’em.

| svn: REPORT of '/svn/gcc/!svn/vcc/default': Could not read response body: connection was closed by server (http://gcc.gnu.org)
NOTE: package gcc-cross-initial-4.6.1+svnr175454-r10: task do_fetch: Failed
ERROR: Task 5 (/home/build/poky-edison-6.0/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb, do_fetch) failed with exit code '1'
ERROR: '/home/build/poky-edison-6.0/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb' failed

At this point I just tried again and it failed in the same place but had checked out more of the code. A quick search turns up a similar error is common when checking out code from SVN servers over HTTP. Apache just has a tendency to timeout when checking out large repositories with mod-svn. The suggested fix is to increase the timeout value in your Apache configs … except these configs are on the GNU web servers and we can’t change them.

What we can change though is the protocol bitbake uses when getting the sources. Just change the proto from ‘http’ to ‘svn’ in the SRC_URI in gcc-4.6.inc (found at meta/recipes-devtools/gcc/gcc-4.6.inc) and we’re almost good. It’ll look like this when you’re done:


SRC_URI = "svn://gcc.gnu.org/svn/gcc/branches;module=${BRANCH};proto=svn"

It still timed out for me a few times but it ran for much longer than the HTTP protocol option. HTTP timed out after about 10 minutes, SVN made it almost an hour before timing out … You’d think there would be a tarball of these sources mirrored somewhere so we didn’t have to kill the GNU SVN servers on every fresh build. Something to look into I guess. Either way gcc is building now, hopefully I’ll have a build running soon …

UPDATE: With some advice from Scott below I used the poky distro by including DISTRO="poky" in my local.conf file. As promised bitbake then doesn’t try to check out the gcc svn repository directly from gnu.org. Instead it grabs a tarball from one of the Yocto mirrors and the build takes mere minutes. Thanks Scott!

LaTeX for your Resume / CV

I’m far from a ninja when it comes to LaTeX but I’m a big fan. I’ve written a bit about formatting logical expressions for past homework exercises and I’ve used it in blog posts for doing the same. It’s a very useful tool even if you’re just using basic templates like me.

A major driver behind my work on this website was the desire to get some of my technical work out into the public domain. Around the same time I started blogging I told myself that I should host my resume on this site as an incentive to keep it up to date. I failed pretty miserably there.

But when I took my position with Citrix nearly a year ago I updated my CV and now I’m resolving to keep it that way. It’s never an easy task to drag an old CV into the modern age and mine had been formatted using a very old style called res from RPI. Instead of struggling to keep the style usable on a modern toolchain I took on migrating to the newer tucv from CTAN.

This was a catalyst for all sorts of useful stuff like getting my CV into a git repo and generally refreshing the content. I’ll be putting together an ‘about’ page on this site where I’ll host it and make the source available as well.

Till then, here’s a quick set of instructions for getting tucv working on Debian Wheezy. Unfortunately I wasn’t able to find tucv in any of the Debian LaTeX / texlive packages, so to get tucv working I had to install the basic latex and texlive packages and then download the .dtx and .ins files manually.

Figuring out how to generate a style file from these sources and where to put them was the next trick. A bit of web searching turned up a manual describing how to use LaTeX on Debian:

  1. Copy the .dtx and .ins files to /usr/local/share/texmf/tex/latex/tucv.
  2. Compile both files using latex to generate the package and documentation.
  3. Register the new style using mktexlsr or texhash.
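Concretely that works out to something like this, run as root from wherever you downloaded the tucv sources:

mkdir -p /usr/local/share/texmf/tex/latex/tucv
cp tucv.dtx tucv.ins /usr/local/share/texmf/tex/latex/tucv/
cd /usr/local/share/texmf/tex/latex/tucv
latex tucv.ins     # runs docstrip, generates tucv.sty
latex tucv.dtx     # builds the package documentation
mktexlsr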

Then all you have to do is make your resume! Following the examples from the CTAN website is the best way to go. Personally I already had a significant amount of content so most of my time was spent playing with layout.

It’s not perfect and I’ll be playing around to see if I can get better spacing in some of the sections that have a two-column layout. The rightmost column is too narrow and forces date ranges onto multiple lines, and I’m not a big fan of how that looks.