December 2014 – technomasochism

I’d been using a rather old Sandy Bridge system (Intel DQ67EP + i7 2600S) to test my work on meta-measured for a long time. Very nice, very stable system. But with Intel getting out of the motherboard business I started eyeing their new venture: the NUC.

The DC53427HYE vPro IVB NUC

Everything is getting smaller and thankfully Intel has finally caught on. Better yet, they’re supporting TXT on some of these systems. When the Haswell NUC was released over the summer, the price on the vPro Ivy Bridge NUC (DC53427HYE) finally dropped enough to put it in my price range. Intel opted to skip the vPro NUC for Haswell anyways so it was my only option.

Let the fun of testing TXT on a new system begin! Like any new system we hope it works “out of the box”. But with TXT, odds are it won’t. My SNB system was great but this NUC … not so much, yet. The kicker though is that as systems get smaller something’s got to give. Space ain’t free and … well who needs a serial port anyways right?

Where’s my serial?

So without serial hardware, debugging TXT / tboot is pretty much a lost cause. Sure you can slow down the VGA output with the vga_delay command line option. But if you want to actually analyze the output you need to be able to capture the text somehow and setting vga_delay to a large value and then copying the output by hand doesn’t scale (and it’s a stupid idea to boot). So the search for serial output continues.

To get TXT we must … ::cough:: … endure the presence of the Management Engine (ME) and it’s supposed to have a serial console built in. The docs for the system even say you can get BIOS output from the ME serial console. But for whatever reason, I spent an afternoon messing about with it and made no progress.

I’ve no way to know where the problem with this lies. There are tools for accessing the ME serial console for Linux but I couldn’t get early boot output. Setting up a serial console login for a bare metal Linux system worked but no early boot stuff (BIOS, grub or tboot). Judging by the AMT docs for Linux: you can’t count on the ME serial interface for much. The docs state that if you use Xen then the ME will get the DHCP address all messed up and that setting a static address in the ME interface just doesn’t work. So long story short, the ME serial interface is limited at best and these limitations preclude getting early boot messages like those from tboot.

Now that the ME bashing is done we must fall back on real serial hardware. Thankfully this thing has both a half height and a full height mini-PCIe slot and a market for these arcane serial things still exists. StarTech fills this need with the 2s1p mini PCIe card. This is a great little piece of hardware but the I/O ports aren’t the default (likely to prevent conflict with on-board serial hardware) so we’ve gotta do some work before tboot will use it for output messages.

We have serial! Now what?

With some real serial hardware we’re half way there. Now we need to get tboot to talk to it. Unfortunately just adding serial to the logging= parameter in the boot config isn’t sufficient. The default base address for the serial I/O port used by tboot is 0x3F8 (see the README). This address corresponds to the “default serial port” aka COM1. So our shiny new mini-PCIe serial hardware must be using a different port.

tboot will log to an alternative port but we need to find the right I/O port address for the add on card. If you’re like me you keep a bootable Linux image on a USB drive handy for times like these. So we boot up the NUC and break out lspci to dump some data about our new serial card:

02:00.0 Serial controller: NetMos Technology PCIe 9912 Multi-I/O Controller (prog-if 02 [16550])
02:00.1 Serial controller: NetMos Technology PCIe 9912 Multi-I/O Controller (prog-if 02 [16550])

Not a bad start. This card has two serial ports and it shows up as two distinct serial devices. To get the I/O port base address we need to throw some -vvv at lspci. I’ll trim off the irrelevant bits:

02:00.0 Serial controller: NetMos Technology PCIe 9912 Multi-I/O Controller (prog-if 02 [16550])
        Subsystem: Device a000:1000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 17
        Region 0: I/O ports at e030 [size=8]
        Region 1: Memory at f7d05000 (32-bit, non-prefetchable) [size=4K]
        Region 5: Memory at f7d04000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
.
.
.

02:00.1 Serial controller: NetMos Technology PCIe 9912 Multi-I/O Controller (prog-if 02 [16550])
        Subsystem: Device a000:1000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin B routed to IRQ 18
        Region 0: I/O ports at e020 [size=8]
        Region 1: Memory at f7d03000 (32-bit, non-prefetchable) [size=4K]
        Region 5: Memory at f7d02000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
.
.
.

The lines we care about here are:

Region 0: I/O ports at e030 [size=8]
Region 0: I/O ports at e020 [size=8]
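
If you’d rather not eyeball the dump, the same two lines can be picked out with a few lines of Python. This is only a minimal sketch: the function name io_ports and the trimmed SAMPLE text are mine, and it assumes the output has the same shape as the lspci -vvv dump above (a device header line followed by indented Region lines).

```python
import re

# Matches a device header such as "02:00.0 Serial controller: ..."
DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) ")
# Matches an indented line such as "Region 0: I/O ports at e030 [size=8]"
IOPORT_RE = re.compile(r"^\s+Region \d+: I/O ports at ([0-9a-f]+)")


def io_ports(lspci_text):
    """Map each PCI address in `lspci -vvv` output to its first I/O port base."""
    ports = {}
    device = None
    for line in lspci_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            device = header.group(1)
            continue
        region = IOPORT_RE.match(line)
        # Only record the first I/O region seen for each device.
        if region and device is not None and device not in ports:
            ports[device] = int(region.group(1), 16)
    return ports


# Trimmed copy of the output shown above, used here as example input.
SAMPLE = """\
02:00.0 Serial controller: NetMos Technology PCIe 9912 Multi-I/O Controller (prog-if 02 [16550])
        Region 0: I/O ports at e030 [size=8]
02:00.1 Serial controller: NetMos Technology PCIe 9912 Multi-I/O Controller (prog-if 02 [16550])
        Region 0: I/O ports at e020 [size=8]
"""

if __name__ == "__main__":
    for device, port in sorted(io_ports(SAMPLE).items()):
        print(device, hex(port))
```
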

So the I/O port address for 02:00.0 is 0xe030 and 02:00.1 is 0xe020. The 9 pin headers on the board are labeled S1 and S2 so you can probably guess which is which. With the NUC booted off my Linux USB key we can dump more data about the hardware so we know for sure, but with a serial cable hooked up to S1 I just threw some text at the device to see if something would come out the other end:

echo "test" > /dev/ttyS0

Sure enough I got "test" out. So I know my cable is hooked up to ttyS0. Now to associate /dev/ttyS0 with one of the PCI devices so we can get the I/O port. Poking around in sysfs is the thing to do here:

ls /sys/bus/pci/devices/0000:02:00.0/tty/
ttyS0

With all of this we know we want tboot to log data to I/O port 0xe030 so we need the following options on the command line: logging=serial serial=115200,8n1,0xe030.
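
The sysfs poking can be scripted too. The sketch below is an illustration under stated assumptions, not something tboot ships: the helper names are mine, and it relies on the per-device resource file in sysfs, where each line holds a region’s start, end and flags in hex and I/O port regions carry the 0x100 flag bit. The last helper just formats the command line options given above.

```python
import os

IORESOURCE_IO = 0x100  # flag bit the kernel sets on I/O port regions


def first_io_port(resource_text):
    """Return the base of the first I/O port region in a sysfs `resource` file."""
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "start end flags"
        start, _end, flags = (int(field, 16) for field in fields)
        if flags & IORESOURCE_IO:
            return start
    return None


def io_port_for(pci_addr, sysfs_root="/sys"):
    """Read the resource file for a PCI device such as '0000:02:00.0'."""
    path = os.path.join(sysfs_root, "bus/pci/devices", pci_addr, "resource")
    with open(path) as resource:
        return first_io_port(resource.read())


def tboot_args(port, baud=115200):
    """Format the tboot logging options for a given I/O port base."""
    return "logging=serial serial={},8n1,{:#x}".format(baud, port)


if __name__ == "__main__":
    # Hypothetical device address; substitute the one lspci reports.
    print(tboot_args(io_port_for("0000:02:00.0")))
```
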

Next time

Now that I’ve got some real serial hardware and a way to get tboot to dump data out to it I can finally debug TXT / tboot. We’ll save that for next time.

twobit-git-poller

Now that we’ve worked out the steps necessary to build the software we need to work out how to trigger builds when the source code is updated. Buildbot has extensive infrastructure for triggering builds ranging from classes to poll various SCM systems in a “pull” approach (pulled by buildbot) to scripts that plug into existing SCM repos to run as hook scripts in a “push” approach (pushed by the SCM).

Both approaches have their limitations. The easiest way to trigger events is to use the built in buildbot polling classes. I did use this method for a while with my OE projects but with OpenXT it’s a lost cause. This is because the xenclient-oe.git repo clones the other OpenXT repos at the head of their respective master branches. This means that unlike other OE layers the software being built may change without a subsequent change being made to the meta layer. To use the built in buildbot pollers you’d have to configure one for each OXT git repo and there’s about 60 of them. That’s a lot of git pollers and would require a really long and redundant master config. The alternative here is to set up a push mechanism like the one provided by the buildbot in the form of the git_buildbot.py script.

This sounds easy enough, but given the distributed nature of OpenXT it’s a bit harder than I expected, though not in a technical sense. Invoking a hook script from a git server is easy enough: just drop the script into the git repo, generally as a ‘post-receive’ hook, and every time someone pushes code to it the script will fire. If you control the git server where everyone pushes code this is simple. Github provides a similar mechanism but that would still make for a tight coupling of the build master to the OpenXT Github org, and herein lies the problem.

Tightly coupling the build master to the OpenXT Github infrastructure is the last thing I want to do. If there’s just one builder for a project this approach would work but we’d have to come up with a way to accommodate others wanting to trigger their build master as well. This could produce a system where there’s an “official” builder and then everyone else would be left hanging. Building something that leaves us in this situation knowingly would be a very bad thing as it would solve a technical problem but produce a “people problem” and those are exponentially harder. The ideal solution here is one that provides equal access / tooling such that anyone can stand up a build master with equal access.

Client-side hooks

The only alternative I could come up with here is to mirror every OpenXT repo (all 60 of them) on my build system and then invent a client side hook / build triggering mechanism. I had to “invent” the client side hooks because Git has almost no triggers for events that happen on the client side (including mirrors). My solution for this is in the twobit-git-poller.

It’s really nothing special. Just a simple daemon that takes a config specifying a pile of git repos to mirror. I took a cue from Dickon and his work here by using a small bit of the Github API to walk through all of the OpenXT repos so that my config file doesn’t become huge.
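
To give a feel for the Github API part, here is a minimal sketch. It is not the poller’s actual code: the function names are mine, it assumes the public “list organization repositories” endpoint with unauthenticated access (which is rate limited), and it only collects each repo’s clone_url.

```python
import json
import urllib.request

API = "https://api.github.com/orgs/{org}/repos?per_page=100&page={page}"


def clone_urls(payload):
    """Pull the clone URL out of each repo object in one page of results."""
    return [repo["clone_url"] for repo in payload]


def org_clone_urls(org):
    """Walk every page of an org's repo listing and collect the clone URLs."""
    urls, page = [], 1
    while True:
        with urllib.request.urlopen(API.format(org=org, page=page)) as response:
            payload = json.loads(response.read().decode("utf-8"))
        if not payload:  # an empty page means we've run off the end
            return urls
        urls.extend(clone_urls(payload))
        page += 1


if __name__ == "__main__":
    for url in org_clone_urls("OpenXT"):
        print(url)
```

With something like this the config file only needs to name the org, not each of the 60 or so repos.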

The thing that makes this unique is some magic I do to fake a hook mechanism as much like post-receive as possible. This allows us to use a script like the git_buildbot.py unmodified. So I expanded the config file to specify an arbitrary script so others can expand on this idea but this script is expected to take input identical to that produced by the post-receive and expected by the git_buildbot.py script.

This is implemented with a naive algorithm: every time the git poller daemon is run we just iterate over the available branches, grab the HEAD of each before and after the fetch and then dump this data into the hook script. I don’t even try to detect whether or not there was a change; I just dump data into the hook script. This allows us to use the git_buildbot.py script unmodified and buildbot is smart enough to ignore a hook event where the start and end hashes are the same. There are some details around passing authentication parameters and connection strings but those are just details and they’re in the code for the interested reader.
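
Roughly, the idea looks like the sketch below. This is an illustration of the algorithm described above rather than the code from twobit-git-poller: the function names are mine, it assumes a mirror clone whose refs/heads are updated directly by the fetch, and it ignores deleted branches. The one format it leans on is the post-receive stdin convention of one “old-sha new-sha refname” line per ref.

```python
import subprocess


def branch_heads(repo):
    """Return {refname: sha} for every branch in a mirror repo."""
    out = subprocess.check_output(
        ["git", "for-each-ref", "--format=%(objectname) %(refname)", "refs/heads"],
        cwd=repo, universal_newlines=True)
    return {ref: sha for sha, ref in
            (line.split(" ", 1) for line in out.splitlines())}


def hook_input(before, after):
    """Build post-receive style stdin: one 'old new ref' line per branch."""
    zero = "0" * 40  # git's placeholder for a ref that did not exist before
    return "".join(
        "{} {} {}\n".format(before.get(ref, zero), sha, ref)
        for ref, sha in sorted(after.items()))


def poll(repo, hook):
    """Fetch, then feed before/after hashes to the hook, changed or not."""
    before = branch_heads(repo)
    subprocess.check_call(["git", "fetch", "--prune"], cwd=repo)
    after = branch_heads(repo)
    # No change detection here: unchanged branches are passed along as well.
    subprocess.run([hook], input=hook_input(before, after), cwd=repo,
                   universal_newlines=True, check=True)
```
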

With this client side hooking mechanism we no longer need the poller objects in buildbot. We can just hook up a PBChangeSource that listens for data from the twobit-git-poller or any other change source. AFAIK this mechanism can live side by side with standard git hooks, so if you’ve got an existing buildbot that’s triggered by some git repos that your team is pushing to, using this poller shouldn’t interfere. If it does let me know about it so we can sort it out … or you could send me a pull request 🙂

Wrap Up

When I started this work I expected the git poller to be something that others may pick up and use and that maybe it would be adopted as part of the OpenXT build stuff. Now that I think about it though I expect the whole twobit-git-poller to be mostly disposable in the long run. We’ve already made huge progress in aligning the project with upstream Open Embedded and I expect that we’ll have a proper OE meta layer sooner than later. If this actually comes to pass there won’t be enough git repos to warrant such a heavy weight approach. The simple polling mechanisms in buildbot should be sufficient eventually.

Still I plan to maintain this work for however long it’s useful. I’ll also be maintaining the relevant buildbot config which I hope to generalize a bit. It may even become as general as the Yocto autobuilder but time will tell.

