I wrote a few days back about getting serial output from tboot on my new-ish Ivy Bridge vPro NUC. That was a means to an end, so this is where we’ll cover actually using the serial hardware to do something meaningful.
tboot log from the DC53427HYE NUC
Testing TXT and tboot on a new system is often painful. If there’s something wrong and tboot can’t execute SENTER successfully, the system just reboots, and it will keep rebooting indefinitely. This confuses the hell out of most people and necessitates either serial hardware to capture log output for troubleshooting, or a patch to bypass the reboot logic.
The bit about the patch is interesting and I’ve hacked one out here. I hacked this together without much thought about the security implications as a work-around so I don’t recommend it for production use. It was only intended as a way to bring up a system with a borked TXT implementation so data could be collected with
Regardless of how you go about getting your tboot output, reading it is your first step in debugging. Here’s the tboot log captured from my DC53427HYE NUC. You can see the automated reboot after SENTER fails.
Debugging my tboot failure
I’ve jacked up the logging level of tboot so there’s a lot of data to dig through in the log. Generally, though, the data we need is in the TXT.ERRORCODE register. Don’t forget that on the first boot this value will be 0x0, since it’s only set once the failure occurs. The interested reader can, well, read all about this register in the MLE developers guide, section B.1.3.
So after the failed boot the TXT.ERRORCODE gets set and we can grab it from the log. The relevant line is:
TBOOT: TXT.ERRORCODE: 0xc0021041
Not a particularly helpful error message, but then again there isn’t much space for a helpful textual description of the error in a 32-bit register. So the next step is to decode this thing.
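If the serial capture is saved to a file, pulling the value out can be scripted. The snippet below is only a sketch: the function name and the sample text are made up for illustration, and it assumes the log lines look like the one quoted above. It skips 0x0 values, since that is what the register reads before a failure has been recorded.

```python
import re

# Matches lines such as "TBOOT: TXT.ERRORCODE: 0xc0021041".
ERRORCODE_RE = re.compile(r"TXT\.ERRORCODE:\s*(0x[0-9a-fA-F]+)")


def find_error_codes(log_text):
    """Return every non-zero TXT.ERRORCODE value in the log, in order of appearance."""
    codes = [int(match.group(1), 16) for match in ERRORCODE_RE.finditer(log_text)]
    return [code for code in codes if code != 0]


# Hypothetical two-boot capture: the first boot shows 0x0, the second shows the failure.
sample_log = (
    "TBOOT: TXT.ERRORCODE: 0x0\n"
    "TBOOT: TXT.ERRORCODE: 0xc0021041\n"
)

for code in find_error_codes(sample_log):
    print(f"{code:#010x}")
```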
Decoding the TXT.ERRORCODE
The MLE developers guide describes the general structure of the data in this register but the error code itself is specific to the ACM used by the platform. Again this data is in the tboot log file:
TBOOT: checking if module /acm_hsw.bin is an SINIT for this platform...
TBOOT: chipset production fused: 1
TBOOT: chipset ids: vendor: 0x8086, device: 0xb001, revision: 0x1
TBOOT: processor family/model/stepping: 0x306a9
TBOOT: platform id: 0x10000000000000
TBOOT: 1 ACM chipset id entries:
TBOOT: vendor: 0x8086, device: 0xb002, flags: 0x1, revision: 0x1, extended: 0x0
TBOOT: chipset id mismatch
TBOOT: checking if module /acm_ivb.bin is an SINIT for this platform...
TBOOT: 1 ACM chipset id entries:
TBOOT: vendor: 0x8086, device: 0xb001, flags: 0x1, revision: 0x1, extended: 0x0
TBOOT: 4 ACM processor id entries:
TBOOT: fms: 0x206a0, fms_mask: 0xfff3ff0, platform_id: 0x10000000000000, platform_mask: 0x1c000000000000
TBOOT: fms: 0x206a0, fms_mask: 0xfff3ff0, platform_id: 0x4000000000000, platform_mask: 0x1c000000000000
TBOOT: fms: 0x306a0, fms_mask: 0xfff3ff0, platform_id: 0x10000000000000, platform_mask: 0x1c000000000000
TBOOT: SINIT matches platform
You can see on line 1 that tboot is trying the HSW / Haswell ACM, which doesn’t match the platform. Then on line 9 it gives the IVB / Ivy Bridge ACM a try and that one matches. That means our error code is specific to the IVB ACM, so we’ll have to dig through the docs from that tarball. If you’ve built meta-measured the appropriate PDF will be located at:
But before we can make use of this we’ve gotta parse out the error code into its component parts. From this doc they’re defined as (from MSB to LSB):
- bit 31 – Valid
- bit 30 – External
- bits 29:25 – Reserved
- bits 24:16 – Minor Error Code
- bit 15 – SW Source
- bits 14:10 – Major Error Code
- bits 9:4 – Class Code
- bits 3:0 – Module Type
So we need to divide the error code 0xc0021041 on these boundaries and then go back to the docs to figure out what each field means:
Valid: 0x1 - The error code is valid.
External: 0x1 - Error state induced by external software.
Reserved: 0x0 - No significance.
Minor Error Code: 0x2 - Fatal and TPM specific.
SW Source: 0x0 - Generated by the ACM.
Major Error Code: 0x4 - TPM NV is unlocked.
Class Code: 0x4 - TPM Access
Module Type: 0x1 - SINIT Module
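The splitting itself is just shifts and masks. Here is a minimal sketch of it using the bit boundaries listed above; the names (TxtErrorCode, decode_txt_errorcode) are my own for illustration and don't come from tboot or the ACM docs. It only extracts the numeric fields, so you still need the ACM error documentation to look up what each value means.

```python
from typing import NamedTuple


class TxtErrorCode(NamedTuple):
    """Numeric fields of TXT.ERRORCODE, ordered from MSB to LSB."""
    valid: int        # bit 31
    external: int     # bit 30
    reserved: int     # bits 29:25
    minor: int        # bits 24:16
    sw_source: int    # bit 15
    major: int        # bits 14:10
    class_code: int   # bits 9:4
    module_type: int  # bits 3:0


def _bits(value, hi, lo):
    """Extract bits hi..lo (inclusive) from value."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def decode_txt_errorcode(code):
    """Split a 32-bit TXT.ERRORCODE value into its component fields."""
    return TxtErrorCode(
        valid=_bits(code, 31, 31),
        external=_bits(code, 30, 30),
        reserved=_bits(code, 29, 25),
        minor=_bits(code, 24, 16),
        sw_source=_bits(code, 15, 15),
        major=_bits(code, 14, 10),
        class_code=_bits(code, 9, 4),
        module_type=_bits(code, 3, 0),
    )


fields = decode_txt_errorcode(0xC0021041)
for name, value in fields._asdict().items():
    print(f"{name}: {value:#x}")
```

Running this against 0xc0021041 gives the same field values as the breakdown above.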
Going through each of these takes a while so we’ll focus on the important stuff: The Major error code. Actually the error text says it all: The TPM NV RAM is unlocked and it shouldn’t be. With the TPM in this state tboot also complains in the boot log, see line 3 below from the log linked above:
TBOOT: TPM: TPM Family 0x0
TBOOT: TPM is ready
TBOOT: TPM nv_locked: FALSE
TBOOT: TPM timeout values: A: 750, B: 750, C: 750, D: 750
So there’s the problem. Now what’s the solution?
Unlocked TPM NVRAM
The TPM NVRAM is described in the relevant TCG TPM 1.2 spec, section 19: “NV Storage Structures”. The part relevant to us is 19.1.1, where the required TPM_NV_INDEX values are described. These are the NV indexes that “must be found on each TPM regardless of platform”.
The first index listed in this section is TPM_NV_INDEX_LOCK, and given the error code we’re getting that looks relevant. A bit of reading and you’ll see why having this index defined on a TPM is so important and why shipping a platform with it undefined is considered a security issue. Turns out that without this index defined the TPM doesn’t enforce authorization protections. In this state an attacker can write to the NVRAM repeatedly, wear it out (since NVRAM can only be written a finite number of times) and effectively DoS the TPM completely by making the NVRAM unusable. Very surprising that Intel is shipping the NUC in this state. Anyways, nothing we can’t fix …
At this point I went back into the NUC and used tpm_nvinfo to dump the NVRAM indexes defined on my platform:
NVRAM index : 0x10000001 (268435457)
PCR read selection:
 Localities : ALL
PCR write selection:
 Localities : ALL
Permissions : 0x00001002 (WRITEALL|OWNERWRITE)
bReadSTClear : FALSE
bWriteSTClear : FALSE
bWriteDefine : FALSE
Size : 20 (0x14)

NVRAM index : 0x1000f000 (268496896)
PCR read selection:
 Localities : ALL
PCR write selection:
 Localities : ALL
Permissions : 0x00020002 (OWNERREAD|OWNERWRITE)
bReadSTClear : FALSE
bWriteSTClear : FALSE
bWriteDefine : FALSE
Size : 1129 (0x469)

NVRAM index : 0x50010000 (1342242816)
PCR read selection:
 Localities : ALL
PCR write selection:
 Localities : ALL
Permissions : 0x00000001 (PPWRITE)
bReadSTClear : FALSE
bWriteSTClear : FALSE
bWriteDefine : FALSE
Size : 10 (0xa)

NVRAM index : 0x50000003 (1342177283)
PCR read selection:
 Localities : ALL
PCR write selection:
 Localities : 0x18
Permissions : 0x00000000 ()
bReadSTClear : FALSE
bWriteSTClear : FALSE
bWriteDefine : FALSE
Size : 64 (0x40)

NVRAM index : 0x50000001 (1342177281)
PCR read selection:
 Localities : ALL
PCR write selection:
 Localities : ALL
Permissions : 0x00002000 (WRITEDEFINE)
bReadSTClear : FALSE
bWriteSTClear : FALSE
bWriteDefine : FALSE
Size : 54 (0x36)
I was hoping that the TPM_NV_INDEX_LOCK (defined as index 0xffffffff) would be missing and that defining it would solve my problem. From the output above you can see that it’s not listed. The only relevant data I could find on the web about defining this index was a post on the tboot devel list with someone trying to use the tpmj utility. Digging into all of that Java seemed like way too much work so I gave the tpm_nvdefine utility a go:
root@intel-core-i7-64:~# tpm_nvdefine --index=0xffffffff --size=0
Successfully created NVRAM area at index 0xffffffff (4294967295).
Success! I had hoped then that executing tpm_nvinfo again would show this new index … but it doesn’t. So other than the “success” message from tpm_nvdefine we have no direct way of knowing whether or not the new index was actually defined. The easiest way to test this is to try booting with tboot again and hope that the error goes away. And it does:
TBOOT: TPM: TPM Family 0x0
TBOOT: TPM is ready
TBOOT: TPM nv_locked: TRUE
TBOOT: TPM timeout values: A: 750, B: 750, C: 750, D: 750
So that’s how you define the TPM_NV_INDEX_LOCK TPM index on your IVB NUC. This effectively locks the TPM NVRAM on a platform that ships with the TPM NVRAM unlocked. Until now I had only seen this on Lenovo systems (lots of them) but I guess Intel is shipping platforms like this too. Having some automated way to detect and fix platforms in this state would be really nice …