The secrets of LTO tape

As I continue to develop a software solution to the problem described in https://darkimmortal.com/lto4-encryption-woes/, I have come across some bits of information about the inner workings of the LTO-4 on-tape format that are undocumented on the public internet.

There is no reason to suggest that these are unique to LTO-4 - the only significant generational differences I am aware of are:

LTO-4 introduces AES-GCM encryption
LTO-5 introduces key wrapping to the encryption, changing every 200 records or so (this seems over the top and potentially even bad for data robustness - a well-placed bitflip could lose you hundreds of MB)
LTO-5 introduces LTFS/partitions/etc which are irrelevant here
LTO-6 improves the compression. Wikipedia refers to an increase in "block size", but this must be nonsense; presumably what they have done is increase the SLDC history buffer size.

Here is my list of findings, and comparisons of HP vs IBM tape drives:

The encryption really is AES-GCM. This article is horse shit. Interesting bit is the last paragraph of page 2 (archive link) - absolute spew of nonsense AES acronyms. Wasted a lot of time entertaining this as truth.
The layout of the encryption format is defined vaguely in IEEE Standard 1619.1 (not freely available and not worth the effort) and more clearly in a free(?) document I found entitled Using AES in GCM Mode for Tape Encryption.
The AES-GCM IV/nonce is 96-bit. On HP tape drives this is 96 bits of RNG which is then incremented per record. On IBM tape drives this is 56 bits of RNG followed by zeroes, with the whole lot incremented per record. I'm not sure IBM's implementation is secure enough as per NIST SP 800-38D.
In both drives the RNG part of the IV is constant per tape insertion, and changes on an eject/load cycle. This seems moderately insecure, since writing a tape, rewinding, and then writing it again within the same load will reuse IVs. (Maybe this was a contributing factor to the key wrapping added in LTO-5, as LTFS increases the chances of rewinding/rewriting.)
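A minimal sketch of the two IV schemes and the rewind problem described above. The byte order (big-endian), the placement of IBM's 56 random bits in the most significant positions, starting at the base value for the first record, and wrapping at 2**96 are all assumptions for illustration; the function names are hypothetical and not taken from ltoex or any drive firmware.

```python
import secrets

IV_BITS = 96
IV_MASK = (1 << IV_BITS) - 1


def hp_style_base_iv() -> int:
    """96 random bits, chosen once per tape load (HP behaviour as observed)."""
    return secrets.randbits(96)


def ibm_style_base_iv() -> int:
    """56 random bits followed by 40 zero bits (IBM behaviour as observed).

    Assumption: the random bits occupy the most significant positions.
    """
    return secrets.randbits(56) << 40


def record_ivs(base_iv: int, n_records: int) -> list[bytes]:
    """IVs for n consecutive records: base, base+1, ... (mod 2**96).

    Assumption: the whole 96-bit value is one big-endian counter.
    """
    return [
        ((base_iv + i) & IV_MASK).to_bytes(12, "big")
        for i in range(n_records)
    ]


# Write 1000 records, rewind, write 1000 records again without an
# eject/load cycle: the base IV has not changed, so every IV repeats.
base = ibm_style_base_iv()
first_pass = record_ivs(base, 1000)
second_pass = record_ivs(base, 1000)
reused = set(first_pass) & set(second_pass)
```

Here `len(reused)` comes out as 1000: if the key is also unchanged between the two passes, every (key, IV) pair is used twice, which is exactly the situation GCM requires you to avoid.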
There are 16 bytes of Additional Authenticated Data (AAD). Even if no authenticated key data is set in the SCSI SECURITY PROTOCOL OUT command, the AAD is still present, containing mostly zeroes, and is essential for correct decryption of the ciphertext.
At the end of the record is a 16-byte tag which provides authentication. Fantastic means of detecting bitrot and a good reason to always use hardware encryption, even with an insecure key - it will beat any CRC etc done in hardware or any software hashing you can throw at it.
Encrypted/compressed blocks are not a consistent length, which is perhaps unsurprising given the compression. Dealing with these varying block lengths behind the Linux block device abstraction is not fun.
HP half-heartedly tries to protect raw reads by requiring that records are written with the RDMC option to SECURITY PROTOCOL OUT set appropriately. IBM doesn't care and lets you raw read to your heart's content in all scenarios.
Raw reading is OP. I would recommend always enabling the hardware encryption and associated RDMC flag just to get access to raw reads. It is fantastic to be able to inspect the encrypted and, once decrypted, compressed forms of each record - a huge win for future data robustness and accessibility - potentially allowing manual recovery of corrupt records rather than binning the whole record.
Compression is indeed SLDC (Streaming Lossless Data Compression, as defined by ISO/IEC 22091:2002 and ECMA-321) as claimed by the specs. It is definitely not ALDC, although you can get something vaguely resembling the correct output by running data through ALDC decompression. The compression algorithm was named 'LTO DC' before SLDC was a published specification - as far as I am aware they are identical.
Compression is always on, even when disabled. When disabled, blocks are unconditionally written in Scheme 2, which inflates 0xFF bytes by 1 bit - a tiny hit for the majority of data - and also adds a handful of control bytes per record. This same mode is used for incompressible data anyway if compression is enabled, so there is no reason to ever disable compression. The marketing material claiming that compression switches 'off' for incompressible data, and the posts claiming that a 'bit' in the record header toggles compression, are incorrect.
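To put a number on that "tiny hit", here is a rough sketch of Scheme 2 literal encoding. It assumes, from my reading of ECMA-321, that bytes 0x00-0xFE are emitted as plain 8-bit literals and 0xFF is emitted as eight 1 bits followed by a single 0 bit. Control symbols (the "handful of control bytes per record") are deliberately left out, and the function names are hypothetical.

```python
def scheme2_bits(data: bytes) -> str:
    """Encode payload bytes as Scheme 2 literals, as a string of '0'/'1'.

    Assumption: a literal 0xFF gets one trailing 0 bit; control symbols
    are not modelled here.
    """
    out = []
    for b in data:
        out.append(f"{b:08b}")
        if b == 0xFF:
            out.append("0")  # the one extra bit paid per 0xFF byte
    return "".join(out)


def scheme2_overhead(data: bytes) -> float:
    """Fractional size increase caused by 0xFF escaping alone."""
    if not data:
        return 0.0
    return data.count(0xFF) / (8 * len(data))


# One of each byte value: 256 bytes, exactly one of which is 0xFF.
sample = bytes(range(256))
encoded_len = len(scheme2_bits(sample))   # 256 * 8 + 1 bits
overhead = scheme2_overhead(sample)       # 1 / 2048
```

For uniformly distributed data, about 1 byte in 256 is 0xFF and each costs one extra bit in eight, so the expected expansion is 1/2048, or roughly 0.05%, before the per-record control symbols.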
Something is not quite right about IBM's implementation of SLDC compression. HP drives are able to understand it, but I have not been able to get ltoex working against IBM-written tapes. It works for simple test cases, but falls over against undocumented control symbols in larger data sets. I am 80% confident it's not a bug in ltoex.
Compression ratio is not equal between OEMs. From an identical source (/usr tar), an HP LTO-4 drive produced a compressed stream 4.3% larger than an IBM LTO-4 drive - therefore IBM has slightly better compression. This stands to reason as the compression algorithm of SLDC is not standardised - there are many possible ways to encode a given plaintext in SLDC, and IBM has gobloads of patents in the area.
Overall the specs aren't lying. In an apocalyptic event, you could probably recover your data without a drive by referring to published standards. Try saying that about a hard drive! It's nice to see an 'open' format with as many secrets as LTO does live up to its name.

Update: The software solution is now available at: https://github.com/lukefor/ltoex

8th March 2020
