For those of you struggling to use the TPM2 SAPI (system API) I’ll put the TLDR here hopefully to save you time and suffering:
- Decoding TPM2 response codes is a PITA
- I wrote a tool to make it a bit easier on you. It’s called tpm2_rc_decode and you can find it in the tpm2.0-tools repo.
For those of you interested in the back-story and some general information about TPM2 response codes: read on.
A maintainer’s priorities
I’ve come to realize that the single most important thing (to me) when working with a new library or tool is having a simple / easy debugging cycle. In the most simple case, like a libc function, returning an integer error code that’s easy to understand is crucial. On Linux, the standard man pages and the obligatory ‘return value’ section are usually sufficient.
So when I took over maintenance of the TPM2.0-TSS project a few months back, the first thing I did was try to write a simple program to use the most basic parts of the API. The program I wrote called a single function from the TSS API (Tss2_Sys_GetCapability), but of course it didn’t work initially. Worse yet, the return value that I got from the function call wasn’t something that could be decoded with a simple lookup.
This triggered a sort of revelation where I realized that we’ve made it nearly impossible for people to create meaningful bug reports. Back in June, someone reported the same issue on the tpm2.0-tools project: it’s extremely time consuming to decode TPM response codes by hand. This in turn means that neither the TPM2.0-TSS nor any of the projects consuming it will get good bug reports from their users. Generally this means that they won’t have many users.
This is a pretty well bounded problem and seemed like an “easy win” that would solve two problems at the same time: make using the TPM2 TSS easier for users, and make for higher quality bug reports from said users.
TPM2 response code encoding
TPM2 response codes (henceforth RCs) aren’t simple integers. They’re unsigned 32-bit integers with a pile of information encoded in them. The complexity has a purpose though: a single RC will tell you which layer of the TPM software stack (TSS) produced the RC (from the TPM all the way up to the high level APIs), whether it’s an RC from the 1.2 or the 2.0 spec, which format the RC is in, the severity of the RC (warning vs error) and even which parameter caused the RC. Oh yeah, and what the actual error / warning code is. That’s a lot of data in my book.
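To make that packing concrete, here is a minimal sketch that splits an RC into its raw fields. It is not the code from tpm2_rc_decode; the function name `split_rc` and the dictionary keys are made up for illustration, and the bit positions are my reading of Part 2 of the spec, so check them against the version of the spec you’re working with. The low-bit fields only make sense for RCs that follow the TPM’s own encoding.

```python
def split_rc(rc: int) -> dict:
    """Split a 32-bit TPM2 response code into its raw bit fields.

    Hypothetical helper; bit positions are assumptions taken from my
    reading of Part 2 of the TPM2 spec and the TSS spec.
    """
    if not 0 <= rc <= 0xFFFFFFFF:
        raise ValueError("response code must fit in 32 bits")

    fields = {
        "layer": (rc >> 16) & 0xFF,      # bits 16-23: layer that produced the RC
        "format_one": bool(rc & 0x080),  # bit 7: format selector
    }
    if fields["format_one"]:
        fields["error"] = rc & 0x3F      # bits 0-5: error number
        number = (rc >> 8) & 0xF         # bits 8-11: which parameter/handle/session
        if rc & 0x040:                   # bit 6 set: a parameter is at fault
            fields["subject"] = ("parameter", number)
        elif number & 0x8:               # bit 11 set: a session is at fault
            fields["subject"] = ("session", number & 0x7)
        else:                            # otherwise: a handle is at fault
            fields["subject"] = ("handle", number)
    else:
        fields["error"] = rc & 0x7F          # bits 0-6: error number
        fields["tpm2"] = bool(rc & 0x100)    # bit 8: 2.0 spec (set) vs 1.2 (clear)
        fields["vendor"] = bool(rc & 0x400)  # bit 10: vendor-defined code
        fields["warning"] = bool(rc & 0x800) # bit 11: warning (set) vs error
    return fields


print(split_rc(0x000001C4))
```

For 0x1c4 this reports format one, error number 4 and parameter 1 as the culprit, all from the TPM layer (layer 0).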
The hardest part of decoding these RCs is tracking down data on the format and all of the other bits. A bit of reading and searching will turn up the following:
- Part #1 of the spec (architecture) has a good overview and flow chart for decoding RCs in section 39.
- Part #2 of the spec has the gory details on the bit fields and what they mean in section 6.6.
- The TSS spec documents the response code layer indicators and the TSS specific RCs in section 6.1.2 (NOTE: this will change in the next iteration of the spec)
For the sake of keeping this post as accurate as possible for as long as possible I won’t reproduce much data from the specifications. That’s what they’re for. Instead I’ll keep things limited to discussion of the tool that I wrote, some of the major annoyances that I encountered and some of the work I’m doing upstream to fix things.
tpm2_rc_decode
The algorithm for decoding RCs in part 1 of the spec was a good starting point but it’s not sufficient. It omits the details around decoding RCs generated by software outside of the TPM. For this I had to account for the ‘layer indicator’ from the RC. The augmented algorithm is documented in the commit message for the tpm2_rc_decode tool.
Once this algorithm was implemented and documented the tool mostly becomes an exercise in looking up strings in tables that map some integer value (the error code in bits 0 through 5 or 6, or the layer identifier in bits 16 through 23). I had hoped to be able to automate the generation of these tables from the specification but parsing a PDF is a pain in the ass. I’ll probably end up posting more on automating code generation from the TPM2 specs in the future so I won’t say much here.
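A rough sketch of that lookup step, with the layer check done first, might look like the following. This is not the tool’s actual code: the table names and `rc_name` are hypothetical, the tables hold only a handful of codes as I remember them from Part 2 (a real tool needs all of them), and treating layer 0 as “the TPM itself” is an assumption based on the TSS spec.

```python
LAYER_SHIFT = 16   # layer indicator lives in bits 16-23
LAYER_MASK = 0xFF

# Hypothetical, deliberately tiny tables keyed by error number.
FORMAT_ONE_NAMES = {0x04: "TPM_RC_VALUE", 0x0B: "TPM_RC_HANDLE"}
FORMAT_ZERO_ERRORS = {0x00: "TPM_RC_INITIALIZE", 0x01: "TPM_RC_FAILURE"}
FORMAT_ZERO_WARNINGS = {0x22: "TPM_RC_RETRY"}


def rc_name(rc: int) -> str:
    """Map an RC to a symbolic name, or a placeholder if it isn't in the tables."""
    layer = (rc >> LAYER_SHIFT) & LAYER_MASK
    if layer != 0:
        # Produced by software above the TPM; those codes need their own table.
        return f"layer {layer} code 0x{rc & 0xFFFF:04x}"
    if rc == 0:
        return "TPM_RC_SUCCESS"
    if rc & 0x080:                       # format one: error number in bits 0-5
        table, code = FORMAT_ONE_NAMES, rc & 0x3F
    elif not rc & 0x100:                 # version bit clear: a TPM 1.2 code
        return f"TPM 1.2 code 0x{rc & 0x7F:02x}"
    elif rc & 0x800:                     # format zero warning, bits 0-6
        table, code = FORMAT_ZERO_WARNINGS, rc & 0x7F
    else:                                # format zero error, bits 0-6
        table, code = FORMAT_ZERO_ERRORS, rc & 0x7F
    return table.get(code, f"unknown code 0x{code:02x}")


for rc in (0x1C4, 0x922, 0x80001):
    print(f"0x{rc:08x}: {rc_name(rc)}")
```

Warnings and errors get separate tables here because, in format zero, the same error number can mean different things depending on the severity bit.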
The neglected TSS spec
This was really my first bit of work that required I dig into the RC portion of the TSS specification. When I started this work it was in pretty bad shape. The structure and definition of all of the TSS RCs and their meaning was well done but the language used in this part of the spec was horribly inconsistent. It interchanged terms like ‘error code’ with ‘response code’ and ‘return code’ seemingly at random. It played similar games with terms like ‘error level’ and ‘layer indicator’.
Now I’ve been called pedantic in the past. And I won’t argue with that label. Attention to detail is a hobby of mine. And I’m of the opinion that, when it comes to technical specifications, pedantry is a virtue. When you’re trying to pick up concepts like this, with data densely packed into a single unsigned 32-bit integer, exactness is paramount.
There’s nothing worse (in my mind) than having loosely defined terms causing confusion between people new to the spec. As a maintainer it’s hard enough to figure out what noobs mean when they report a bug. If the terms they get from the specification can mean any number of things, we’re increasing the likelihood of miscommunication and frustration.
Conclusion
So my first contribution to the TSS specification was the complete rewriting of the RC section. The values of the constants have remained the same but we now use consistent language for ‘layer indicator’, ‘layer shift’ etc and we’ve removed uses of terms like ‘error level’ and ‘error code’ in favor of ‘layer’ and ‘response’.
If you’re developing software using the SAPI, or using the tpm2.0-tools I highly encourage you to use the tpm2_rc_decode utility. It should make your life a lot easier and the lives of those you may end up communicating with when you’re trying to debug your code.
Finally this tool isn’t perfect. None of the RCs that are generated by the resource manager are handled yet so there’s plenty of room for improvement. If you’ve got the cycles and you’re sufficiently motivated I’d gladly take patches to improve the tool.
For what it’s worth, my TPM 2.0 test scripts contain a very rudimentary error decoder: http://git.infradead.org/users/jjs/tpm2-scripts.git. It is not as detailed, but I think it makes it easier to get an idea of how the error is encoded (the decoder is just used to show what happened when one of my test tools fails).