Calculating the MLE hash

My work to calculate PCR[18] from the last post was missing one big piece. I took a shortcut and parsed the MLE hash out of the SINIT to MLE data table. This was a stopgap.
The MLE wasn’t being measured directly. We were still extracting the measurement as taken by the SINIT, which is a binary blob from Intel. We have no choice but to trust this blob, but we can verify the measurements it takes. With this in mind I’ve gone back and added a tool to the pcr-calc module to calculate the MLE hash directly from the MLE.

The MLE Hash

Calculating the MLE hash is a bit more complicated than just hashing the ELF binary that contains it. There’s already a utility in the tboot project that does this, though it’s pretty limited as it only dumps out the hash as a hex string. My end goal is to integrate this work into a bitbake class, so having a Python class that emits a hash object containing the measurement of the MLE is a lot more convenient.

In the pcr-calc project I’ve added a few things to make this happen. First is a class called mleHeader that parses the MLE header. This is just more of the mundane data parsing that I’ve been doing since this whole thing started. Finding the MLE header is just a matter of searching for the magic MLE UUID: 5aac8290-6f47-a774-0f5c-55a2cb51b642. Having the header isn’t enough though. The MLE must be extracted from the ELF and this is particularly hard because I know nothing about the structure of ELF files.
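As a rough illustration of the UUID search, here is a minimal sketch. The function name find_mle_header is hypothetical and is not taken from pcr-calc, and the sketch assumes the header stores the UUID bytes in exactly the order they are written above.

```python
# The MLE UUID as quoted in the text, with the dashes removed.
# Assumption: these 16 bytes appear in the image in this exact order.
MLE_UUID = bytes.fromhex("5aac8290-6f47-a774-0f5c-55a2cb51b642".replace("-", ""))


def find_mle_header(image: bytes) -> int:
    """Return the offset of the first MLE UUID match in image.

    Hypothetical helper for illustration; raises ValueError if the
    UUID is not present.
    """
    offset = image.find(MLE_UUID)
    if offset < 0:
        raise ValueError("MLE header UUID not found")
    return offset
```

The header fields follow the UUID, so the returned offset is where header parsing would begin.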

To do the extracting I basically ported the mlehash utility from tboot to Python. The MLE is actually stored in the loadable segments described by the ELF program headers, so this requires parsing those headers and extracting the PT_LOAD segments. Writing a generic ELF parser is way beyond the scope of what I’m qualified to do, but thankfully Eli Bendersky already has a handle on this. Check out pyelftools on his GitHub page. You can install pyelftools through the Python package system like so:

$ pip install pyelftools

I’ve not yet integrated a check for this package into the pcr-calc autotools stuff but I’ll get around to it.

So in pcr-calc, the MLEUtil class does a few things. First it unzips the ELF file if necessary. Second, the ELFFile class from pyelftools is used to extract the PT_LOAD segments from the ELF. These are copied to a temporary file and the excess space is zero-filled. Once the ELF is extracted we locate the MLE header by searching for the UUID above. This header is represented and parsed by the mleHeader object.
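The extraction steps can be sketched roughly as follows. This is not the MLEUtil code. The function names are hypothetical, the result is built in memory rather than in a temporary file, and placing each segment relative to the lowest physical address (p_paddr) is an assumption about how the layout is done. The pyelftools import sits inside the function so the layout helper can be exercised without the package installed.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(raw: bytes) -> bytes:
    """Decompress raw if it starts with the gzip magic, else return it as-is."""
    return gzip.decompress(raw) if raw[:2] == GZIP_MAGIC else raw


def lay_out_segments(segments):
    """Copy (paddr, data, memsz) tuples into one zero-filled buffer.

    Assumption: each segment is placed at its physical address relative
    to the lowest physical address among the segments.
    """
    if not segments:
        raise ValueError("no PT_LOAD segments found")
    base = min(paddr for paddr, _, _ in segments)
    end = max(paddr + memsz for paddr, _, memsz in segments)
    image = bytearray(end - base)  # bytearray(n) starts out zero-filled
    for paddr, data, memsz in segments:
        data = data[:memsz]  # never copy more than the in-memory size
        start = paddr - base
        image[start:start + len(data)] = data
    return bytes(image)


def extract_load_segments(path: str) -> bytes:
    """Read an ELF (optionally gzipped) and return its laid-out PT_LOAD segments."""
    from elftools.elf.elffile import ELFFile  # third-party: pyelftools

    with open(path, "rb") as f:
        raw = maybe_gunzip(f.read())
    elf = ELFFile(io.BytesIO(raw))
    segments = [
        (seg["p_paddr"], seg.data(), seg["p_memsz"])
        for seg in elf.iter_segments()
        if seg["p_type"] == "PT_LOAD"
    ]
    return lay_out_segments(segments)
```

Any gap between segments, and any space where a segment occupies more memory than it has file data, stays zero because the buffer starts out zeroed.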

The end goal is to calculate the SHA1 hash of the MLE. The fields in the header we need to do this are mle_start_off and mle_end_off. These are the offsets to the start and end of the MLE respectively. Both offsets are relative to the beginning of the extracted ELF. The hash is then simply calculated over the data in this range.
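Hashing that range is short enough to sketch directly. The function name hash_mle is hypothetical, the two offsets are passed in rather than parsed from the header, and treating mle_end_off as an exclusive bound is an assumption.

```python
import hashlib


def hash_mle(image: bytes, mle_start_off: int, mle_end_off: int):
    """Return a SHA1 hash object over image[mle_start_off:mle_end_off].

    image is the extracted ELF; both offsets are relative to its start.
    Assumption: mle_end_off is exclusive (one past the last MLE byte).
    """
    if not 0 <= mle_start_off <= mle_end_off <= len(image):
        raise ValueError("MLE offsets fall outside the extracted image")
    return hashlib.sha1(image[mle_start_off:mle_end_off])
```

Returning the hash object rather than a hex string leaves the caller free to ask for digest() or hexdigest() as needed.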

Housekeeping

With the objects necessary to calculate the MLE hash done, I went back and updated the pcr18 utility. Instead of parsing the hash out of the TXT heap it now hashes the MLE directly. The mlehash program is constructed in a similar way but is limited to calculating the MLE hash only.
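For context, folding a measurement into a PCR uses the TPM 1.2 extend operation, where the new value is the SHA1 of the old value concatenated with the measurement. The sketch below shows only that operation. The function name pcr_extend is hypothetical, and the all-zero starting value and placeholder input in the usage lines are assumptions for illustration, not a statement of exactly what the pcr18 utility extends.

```python
import hashlib

SHA1_LEN = 20  # size in bytes of a SHA1 digest and of a TPM 1.2 PCR


def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return SHA1(pcr || measurement), the TPM 1.2 extend operation."""
    if len(pcr) != SHA1_LEN or len(measurement) != SHA1_LEN:
        raise ValueError("PCR and measurement must both be 20 bytes")
    return hashlib.sha1(pcr + measurement).digest()


# Usage with placeholder data: extend an all-zero PCR with one digest.
placeholder_digest = hashlib.sha1(b"placeholder MLE bytes").digest()
pcr_value = pcr_extend(b"\x00" * SHA1_LEN, placeholder_digest)
```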

Conclusion

A significant amount of the work in calculating the MLE hash was just code reading: first to understand how to extract and measure the MLE, and second to understand how to use the pyelftools package. Using pyelftools means that pcr-calc has a new dependency but it’s a lot better than implementing it myself. Working with pyelftools has been beneficial not only in that it saves me effort but it’s also an excellent example to work from. pcr-calc is my first attempt at implementing anything in Python and it shows. Having poked around in pyelftools a little bit I’ve realized that even though my code “works” it’s pretty horrible. Future efforts to “clean up” pcr-calc will model significant portions of it after the code in pyelftools.

Having completed calculating the MLE hash we’ve taken a big step forward in our effort to construct future PCR values by measuring the individual components. It’s the last step in removing dependence on the extracted heap. We can now calculate PCR[18] and PCR[19] without any knowledge of or access to the deployed platform hardware and that’s pretty great. PCR[17] by contrast contains a whole bunch of stuff like the STM hash that’s independent from the Linux OS being run. For now I’m happy to assume PCR[17] is static for a system and doesn’t need to be calculated in the build system.

Eventually I’d like to extend pcr-calc to include mechanisms for ingesting an LCP and calculating PCR[17] but that’s a long way off. Instead, my next steps will be to clean up the pcr-calc code and integrate it into the meta-measured OE layer. The end goal here is to produce a manifest that a 3rd party (an installer or a remote system) can use either to seal secrets to a future platform state or to appraise an attestation exchange. More on this front next.