Bab

  1. Introduction
    1. The Bab Hash Function
      1. Parameters
        1. The Merkle Tree
          1. Instantiations
            1. Via Conventional Hash Functions
              1. WILLIAM3
            2. Design Rationale
              1. Streaming Verification
                1. Slice Verification
                  1. Length Verification
                    1. The is_root Flag
                      1. No Chunk Indices
                        1. The Binary Tree
                          1. Parameterized Chunk Size
                          2. Optimized Streaming Verification
                            1. Unverifiable Sequences
                              1. Omitting Left Labels
                                1. Omitting Lower Layers
                                  1. Slice Streaming
                                  2. Conclusion
                                    1. Acknowledgements
                                      1. References

                                        Bab is a family of secure hash functions, heavily inspired by BLAKE3. Bab hashes allow for streaming verification of strings, similar to the BLAKE3-based Bao algorithm. We discuss several optimization techniques that go beyond the Bao specification. Further, unlike BLAKE3 and Bao, Bab digests allow for constant-sized length proofs of their strings.

                                        1 Introduction

                                        Content-addressable storage lets users refer to strings (files) by a secure hash. Bab is a specification for a hash function specifically designed for usage in peer-to-peer content-addressable storage systems.

                                        The key feature of Bab is streaming verification of a string: as a program reads a string, it can verify at regular intervals that the prefix read so far is indeed part of a string hashing to some target digest. Content-addressed storage systems without this feature face a problem in byzantine environments. Suppose a client requests the string corresponding to some digest. The server replies with some data, but the connection is interrupted before the full string was transfered. What should the client do?

                                        It could store the data it received, and could later issue a request for the same string, resuming at the correct offset. But that request will fail if the orginal server fed it garbage, and the client will be unable to tell which of its two communication partners acted maliciously. Further, the server might have sent it data that is illegal to store. Hence, a careful client should not persist incomplete data. But this makes it unlikely to successfully receive large strings over an unreliable network.

                                        When strings are hashed with Bab, servers can embed into the data stream small proofs that the data they sent so far is indeed prefixing the requested string. More specifically, Bab enables the following features (all explained in detail in Section 3):

                                        • secure content-addressing,
                                        • incremental verification of streaming string data,
                                        • efficient verification of arbitrary subslices within a string, and
                                        • constant-size length proofs for strings.While we explore Bab in the context of hashing, you can also think of it as an authenticated data structure (Tamassia, 2003) for arrays.

                                        Bab employs many of the same techniques as the BLAKE3-based (O’Connor et al., 2020) Bao specification. Key differences are short proofs of string length, more efficient slice requests, and a hash-function-agnostic specification. Bao is a byproduct of a well-designed general-purpose hash function, whereas Bab is a family of special-purpose hash functions that gets to add features beyond Bao’s capabilities.

                                        We define Bab in Section 2, before explaining and justifying the design in Section 3, both from a high-level view and in its details. We examine optimizations for verified streaming in Section 4.

                                        2 The Bab Hash Function

                                        Like BLAKE3, Bab hashes an input string by splitting it into chunks, and arranging the chunks as the roots of a deterministically constructed, binary Merkle-tree; the label of the root becomes the digest of the input.

                                        2.1 Parameters

                                        Bab leaves open some paramters, they need to be given as “input” to the Bab specification.

                                        The chunk_size value must be a natural number greater than zero. Given an input bytestring in of at most bytes, Bab splits it into a non-empty sequence chunks of chunks: the first chunk_size bytes become the first chunk, the next chunk_size bytes become the second chunk, and so on. Only the final chunk may be shorter than chunk_size. For example, splitting the ASCII string hello_world with a chunk_size of 2 yields the chunks he, ll, o_, wo, rl, and d. If in is the empty string, then chunks consists of a single, empty chunk.

                                        The width value must be a natural number greater than zero. Bab maps input bytestrings (of at most bytes) to digests of width many bytes.

                                        The hash_chunk function determines the labels of the leaf nodes of Bab’s Merkle-tree. It must be a function that maps a chunk and a boolean flag (whether the node is the root node or not) to a bytestring of width bytes.

                                        The hash_inner function determines the labels of the inner nodes of Bab’s Merkle-tree. It must be a function that maps a quadruplet of two bytestrings of width bytes (the labels of the two child nodes), an unsigned 64-bit integer (the length of the input substring covered by the node), and a boolean flag (whether the node is the root node or not) to bytestrings of width bytes.

                                        2.2 The Merkle Tree

                                        To hash a bytestring in that was split into a chunk sequence chunks, Bab constructs the unique binary tree with one leaf per chunk such that all left subtrees are complete trees, and the size of the left subtrees is strictly decreasing from left to right11This characerization is taken from the BLAKE3 paper (O’Connor et al., 2020). The construction is identical to that of the certificate transparency logs of RFC 9162 (Laurie et al., 2021).. we number the vertices in depth-first order, prioritizing left children over right children, starting at in the root. Figure 1 shows an example.

                                        The Merkle-tree for the input string *hello_world* at a chunk size of two bytes.

                                        Figure 1: Unlabeled Example Tree

                                        The (unlabeled) binary tree for the input string hello_world with a chunk_size of 2.

                                        Bab assigns to each vertex v a label lbl(v); the root label serves as the digest of in.

                                        If v is a leaf vertex corresponding to some chunk c, then lbl(v) is hash_chunk(c, ) if c is the only chunk in chunks, and hash_chunk(c, ), otherwise.

                                        If v is an inner vertex, let left denote its left child, let right denote its right child, let len denote the total length of the chunks corresponding to all leaf descendents of v, and let is_root be if v is the root of the tree and otherwise. Then lbl(v) is hash_inner(left, right, len, is_root).

                                        Figure 2 demonstrates the label computation in the example Merkle-tree:

                                        The Merkle-tree for the input string *hello_world* at a chunk size of two bytes, where each vertex shows how to compute its label.

                                        Figure 2: Labeled Example Tree

                                        The label computations for the example tree from Figure 1.

                                        2.3 Instantiations

                                        The choice of parameters for using Bab is flexible, but some care needs to be taken to create a secure (i.e., collision-resistant and preimage-resistant) hash function. hash_chunk and hash_inner must be secure hash functions themselves, and it must further be impossible to find an input to hash_chunk and one to hash_inner such that both yield the same digest.

                                        We supply two useful instantiations: one based on arbitrary secure hash functions, and a more efficient one that closely mimics BLAKE3.

                                        2.3.1 Via Conventional Hash Functions

                                        Given a conventional, secure hash function h of digest size w, we can instantiate Bab for a width of w and an arbitrary chunk_size.

                                        Define hash_chunk(chunk, is_root) as applying h to the concatenation of

                                        • chunk, and
                                        • the byte 0x00 if is_root is false, or the byte 0x01 otherwise.

                                        Define hash_inner(left, right, len, is_root) as applying h to the concatenation of

                                        • left,
                                        • right.
                                        • len encoded as an unsigned big-endian 64-bit integer, and
                                        • the byte 0x02 if is_root is false, or the byte 0x03 otherwise.

                                        2.3.2 WILLIAM3

                                        WILLIAM3 is a Bab instantiation that is almost identical to BLAKE3. There are three differences: first, WILLIAM3 does not supply chunk indices into the label computation for chunks. Second, WILLIAM3 incorporates a length value into the label computation of inner tree vertices. And third, WILLIAM3 uses different constants than BLAKE3. WILLIAM3 has a normal hash mode and a keyed hash mode (based on a 256 bit key, by setting the key words to to the key just like BLAKE3); unlike BLAKE3 it does not support a key derivation mode, and it does not allow for extendable output.

                                        WILLIAM3 uses a chunk_size of 1024 bytes (just like BLAKE3). Its width is 32 bytes (just like BLAKE3).

                                        Where BLAKE3 uses its constants WILLIAM3 uses the following constants instead:The constants form the BLAKE3 digest of the ASCII-encoded string WILLIAM3.Using diferent constants than BLAKE3 ensures that BLAKE3 and WILLIAM3 produce non-equal digests for equal inputs, even if the inputs fit into a single chunk.

                                        • : 0xc88f633b
                                        • : 0x4168fbf2
                                        • : 0x6ba32583
                                        • : 0xb0ff1847
                                        • : 0xac57e47d
                                        • : 0xa8931330
                                        • : 0x796a4645
                                        • : 0x6b28a3ee

                                        The hash_chunk function is almost identical to the BLAKE3 computation of chunk chaining values, with a single exceptionExplained in Section 3.5. (beyond the changed constants): where BLAKE3 sets the t parameter of its compression function to the chunk index, WILLIAM3 sets it to 0.

                                        The hash_inner function is almost identical to the BLAKE3 computation of parent node chaining values, with a single exceptionExplained in Section 3.3. (beyond the changed constants): whereas BLAKE3 sets the t parameter of its compression function to 0, WILLIAM3 sets t to the third argument (the length value) of hash_chunk (as an unsigned little-endian 64-bit integer).

                                        3 Design Rationale

                                        We now explain the decisions that went into the definition of Bab.Bab is heavily based off BLAKE3 and Bao, so many of these rationales are restating the design work that went into BLAKE3 and Bao. Compared to those two, only little creativity went into the design of Bab.

                                        3.1 Streaming Verification

                                        Using the root label of a Merkle-tree as the digest allows verifying that some string is a prefix of a string of a known digest. To do so, each chunk is preceded by the labels of the left and right children of all inner vertices on the path from the root of the tree to that chunk. As an optimization, each label is transmitted at most once; it is the receiver’s responsibility to cache labels until they are not needed for verification any longer. This data stream is preceded by the length of the string to yield the baseline verifiable stream. Figure 3 gives an example:

                                        1122The length of the original string in bytes., lbl(2), lbl(9), lbl(3), lbl(6), lbl(4), lbl(5), he, ll, lbl(7), lbl(8), o_, wo, lbl(10), lbl(11), rl, d.

                                        A visualization of the Merkle tree for the string *hello_world*, listing in each vertex the data that that vertex contributes to the verified data stream.

                                        Figure 3: Verifiable Streaming Example

                                        The vertices of our recurring example tree, each showing the data that they contribute to the baseline verifiable stream that lets a client incrementally verify the digest of the string hello_world.

                                        Listing the vertices in the order in which they contribute their child labels or chunks is quite instructive: Can you spot the pattern?

                                        The client can verifies stream by eargerly reconstructing the labels of tree nodes, and asserting that the computed labels match the received data and the lengths of the data match the length values used in the computation of inner labels33Particular care must be taken at the end of the data stream: the end of input implicitly defines the length of the final chunk, and that implicit length must be validated to match the length(s) explicitly used in the digest.. You can go through the verification process for the example stream step by step below. Each step consists of reading either a full label or a full chunk from the stream. The graphic shows the Merkle tree and indicates for each node the knowledge that the client has about it: dim if the client has not yet received any data concerning it, diagonally striped orange if the client has received data that it will need later but cannot verify yet, green if it could verify the data and needs to keep it for verification of subsequent data, and gray if it has verified the data and will not need it for any future verification steps.

                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)

                                        3.2 Slice Verification

                                        The Merkle-tree design allows for verifiable streaming of any slice (of chunks) within a string of known digest. Assume a client wants to receive some number of chunks, starting at some chunk offset. The response data is defined with the same technique as for the baseline verifiable stream: the transmission of each chunk in the slice is preceded by the labels of the left and right children of all inner vertices on the path from the root of the tree to that chunk. Chunks outside the slice do not contribute any data. Again, each label is transmitted at most once. Since the length of the slice is known to the client, the response need not be prefixed by the length. We call this transmission the baseline verifiable slice stream. Figure 4 shows an example of which data needs to be transmitted when the client requests three chunks, starting at offset two (zero-indexed).

                                        lbl(2), lbl(9), lbl(3), lbl(6), lbl(7), lbl(8), o_, wo, lbl(10), lbl(11), rl.

                                        A visualization of the Merkle tree for the string *hello_world*, listing in each vertex the data that that vertex contributes to the verified data stream for the subslice *o_worl*.

                                        Figure 4: Verifiable Slice Streaming Example

                                        The vertices of our recurring example tree, each showing the data that they contribute to let a client incrementally verify the slice o_worl in the string hello_world.

                                        The list of vertices in the order in which they contribute their child labels or chunks has some gaps, but is still strictly ascending:

                                        You can step through the verification process for this example below:

                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        2
                                        9
                                        3
                                        6
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(1) is trusted
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(wo, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)

                                        3.3 Length Verification

                                        A server can provide a short proof for the length of any string to a client who already has the digest of that string. If the string fits into a single chunk, the proof simply is that chunk itself (the client can reconstruct the digest and assert that it matches). If the string is longer than a single chunk, then the root of the Merkle tree is an inner node. The server can then supply the length of the string together with the labels of the left and right children of the root. The client feeds these into hash_inner and asserts that the result matches the digest.

                                        BLAKE3 does not incorporate lengths into the computation of inner vertex labels. BLAKE3 still supports length proofs, these consist of the length followed by the same data as a reply to a slice request for only the final chunk. Such a proof always contains at least a full chunk, plus twice the height of the tree in labels; its size is logarithmic in the length of the string. For a moderately short string (say, 4096 bytes), the length proof via BLAKE3 has a size of bytes. The corresponding WILLIAM3 lenght proof, in comparison, requires bytes.

                                        Even when short length proofs are not needed, a Bab instantiation must not ignore the length argument in hash_inner. Doing so opens up certain attacks where an attacker can create invalid trees whose baseline verifiable slice stream would pass verification for certain slices.

                                        3.4 The is_root Flag

                                        Both hash_chunk and hash_inner take a boolean is_root flag as an argument. This flag must influence the output to guard against length extension attacks. The flag ensures subtree-freeness (Daemen et al., 2018) (also called final-node separability (Bertoni et al., 2014)), which prevents length extension attacks. Ignoring the flag does not make the resulting hash function insecure, but it does open up length-extension attacks.

                                        3.5 No Chunk Indices

                                        BLAKE3 includes the index of each chunk when computing leaf labels in the Merkle-tree. Bab does not. The authors of BLAKE3 give a clear justification for why they added chunk indices:See the BLAKE3 paper, section 7.5.

                                        We [...] use the t parameter during the input phase, to indicate the chunk number. This means that each chunk has a distinct CV44“Chaining Value”, their terminology for Merkle-tree labels. with high probability, even if two chunks have the same input bytes. This is not strictly necessary for security, but it discourages a class of dangerous optimizations.

                                        Consider an application that hashes sparse files, which are mostly filled with zeros. The majority of this application’s input chunks could be the all-zero chunk. This application might try to compute the CV of the zero chunk only once, and then check each chunk before compressing it to see whether it can use that precomputed CV. This check is cheap, and for this application it could be a big speedup. But this optimization is dangerous, because it is not constant-time. The runtime of the hash function would leak information about its input.

                                        If tricks like this were possible, an unsafe implementation would inevitably find its way into some privacy-sensitive use case. By distinguishing each chunk, BLAKE3 deliberately sabotages these tricks, in the hopes of keeping every implementation constant-time.

                                        Bab is decidedly not a general-purpose hash function. It has no ambition to be used as a building block in, say, a digital signature scheme. Hence, we see little value in adding complexity just to deter a certain class of implementation techniques.

                                        3.6 The Binary Tree

                                        Binary trees are not the only candidate digraphs as the merkelization backbone. Other candidate constructions include ternary trees (which are more efficient than binary trees in some applications), and efficient binary linking schemes. However, for verified streaming, binary trees turn out to be more efficient.

                                        Interestingly enough, the humble linked list, aka hash chain, has some intriguing properties. If the Merkle-DAG was a linked list where the label of the second chunk incorporates the label of the first chunk, the label of the third chunk that of the second, and so on, verified streaming could be done without sending any internal labels — by sending chunks in reverse order. Unfortunately, the worst-case size of slice verification proofs would be linear in the size of the tree. This dilemma is similar to that encountered in secure timestamping and related problems.

                                        Also note that the Merkle tree structure allows for easy parallelization of hash computation, whereas a linked list (or any linking scheme) completely precludes parallelization.

                                        3.7 Parameterized Chunk Size

                                        Bab leaves the chunk_size as a freely choosable parameter, because different values incur different tradeoffs. At the most basic, a larger chunk_size shrinks the Merkle tree and thus reduces the metadata overhead in verified streaming, but it also increases the amount of data that needs to be read in sequence without being able to verify it immediately. The chunk_size also serves as an upper bound to the size of proofs of string length in Bab. Finally, the chunk size also affects the performance of computing digests, see the discussion in Section 7.1 of the BLAKE3 paper (O’Connor et al., 2020).

                                        4 Optimized Streaming Verification

                                        The baseline verifiable stream imposes a certain overhead compared to transmitting a raw string. When instantiating with a width of 32 bytes and a chunk_size of 1024 bytes (like BLAKE3 and WILLIAM3), roughly 3.1% of streaming data is metadata. This is already a fairly low overhead, but it turns out we can do better.

                                        There are redundancies in the baseline verifiable stream. Consider again Figure 3, where the stream starts with (lbl(2), lbl(9), lbl(3), lbl(6), …). lbl(2) can be computed from lbl(3) and lbl(6), and both of those are transmitted. So why transmit the redundant lbl(2)?

                                        Technically, all label transmissions are redundant: if the server sends only the chunks (i.e., simply the string itself), the client can successfully reconstruct the digest. The interesting part is the longest consecutive sequence of bytes that the client receives without being able to verify. When sending the raw string, that sequence is simply all of the string but its final byte. The baseline verifiable stream minimizes the length of the longest unverifiable sequence. A scheme that skips lbl(2) sits between the two extremes55Or at least it appears to do so at first glance.. We now examine the notion of unverifiable subsequences in a verification data stream, and how to leverage it for optimizations.

                                        4.1 Unverifiable Sequences

                                        When receiving a string interleaved with verification metadata, we call a transmitted label verified if it is the root label66We assume that the client trusts the digest it uses for a request., or if it and its sibling label have been fed into hash_inner, yielded the label of the parent vertex, and the parent label has also been verified. We call a chunk verified if it has been fed into hash_chunk, yielded the label of the vertex corresponding to the chunk, and that label has also been verified.

                                        An unverifiable sequence is a sequence of consecutive bytes in a response stream whose corresponding labels and chunks cannot yet be verified. We call the length of a maximal unverifiable sequence of a stream its verification delay.

                                        What are the unverifiable sequences in the baseline verifiable stream? Any partially transmitted label is unverifiable. Further, a label of a left child is unverifiable without the label of its sibling. Since these are always transmitted in succession, the longest unverifiable sequences contributed by labels have length Chunks can only be verified when their final byte gets received, so they contribute unverifiable sequences of length

                                        This puts the verification delay of the baseline verifiable stream at the maximum of and In most practical instantiations, the chunk_size should dominate77BLAKE3 and WILLIAM3, for example, have a width of 32 bytes and a chunk_size of 1024 bytes.. For the remainder of this document, we will assume that yielding a verification delay of the baseline verifiable stream of

                                        The verification delay gives an upper bound of how much progress is lost if a connection failure occurs in the worst possible moment. It also gives the greatest amount of untrusted data that a client has to buffer in memory simultaneously88This number is one greater than the verification delay, because the final byte also needs to be loaded into memory for verification. Verification of chunks might not require holding the full chunk in memory, but we assume that the client requested the chunk data for a reason, and thus has to hold on to it even if not strictly necessary for verification alone..

                                        Note that this required buffer capacity is mostly independent from the memory that the client needs for buffering verified labels for reuse in later assertions. Every label of a right child needs to be kept in memory after verification, in order to verify the labels of its two children. Hence, verification always requires the capacity to buffer one label per layer of the Merkle-tree (see this example).

                                        4.2 Omitting Left Labels

                                        When a response stream omits a left label, the client cannot verify the corresponding right label when it arrives. But it can, at a later point, reconstruct the left label, and then use the reconstructed data to verify both the right label and the reconstructed one. We now describe this process more precisely, and then we investigate how to improve upon the baseline verifiable stream with this technique.

                                        Let p be an inner Merkle-tree vertex with left child l and right child r. What precisely happens when we omit lbl(l) from the baseline verifiable stream?

                                        After the omitted label, the stream continues with lbl(r). Unlike in the unmodified baseline verifiable stream, the client cannot verify lbl(l) immediately; the client needs to buffer it as unverified data instead. Next, the stream continues with the data corresponding to l. There are two cases, depending on whether l is a leaf vertex or an inner vertex.

                                        If l is a leaf vertex, then the stream continues with a chunk. This chunk can be fed into hash_chunk to recover a (still unverified) candidate label of l. Feeding this candidate label and the (still unverified) lbl(r) into hash_inner yields a candidate label for p. The client can now compare the candidate label for p against the actual lbl(p) that it received and verified earlier in the stream. If they are equal, then both the received chunk and the received lbl(r) are successfully verified.

                                        If l is an inner vertex, then the stream continues with the labels of the left and right children of l. These can be fed into hash_inner to recover a (still unverified) candidate label of l. The client can then perform the same verification steps as in the first case, leading to verification of both the labels of the children of l, and of lbl(r).

                                        When omitting the label of a leaf vertex l, the verification delay increases99Remember our assumption that by width. But when l is an inner vertex, an unverifiable sequence of length is turned into an unverifiable sequence of length If then the verification delay remains unchanged — despite omitting width many bytes from the verification stream!

                                        Moreover, we can omit successive left labels from the baseline verifiable stream by iterating the reconstruction of the labels. For each successively omitted left label, the corresponding right label needs to be buffered in an unverified state. Figure 5 shows the stream when omitting all left labels except those that are labels of leaf vertices for the hello_world example (that is, omitting the labels of vertices and

                                        111010The length of the original string in bytes., lbl(9), lbl(6), lbl(4), lbl(5), he, ll, lbl(7), lbl(8), o_, wo, lbl(10), lbl(11), rl, d.

                                        A visualization of the Merkle tree for the string *hello_world*, listing in each vertex the data that that vertex contributes to the verified data stream, but omitting the labels for vertices 2 and 3.

                                        Figure 5: Omitting Left Labels Example

                                        The labels that remain in the baseline verifiable stream when omitting left labels (except those that are labels of leaves).

                                        Below, you can step through the verification process. Step 4 demonstrates the cascading verification of several buffered, unverified right labels.

                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        9
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)

                                        This construction suggests a natural optimization over the baseline verifiable stream: keep the labels of all leaf vertices, but omit as many left labels on the paths from the leaves to the root as can be omitted without increasing the verification delay. If the verification delay would be increased, instead include the label, but then resume omitting as many left labels as possible along the remaining path to the root again. This way, only one out of left labels needs to be transmitted.

                                        More precisely: define the layer of a tree vertex as the length of the path from it to its leftmost leaf. The layer of a leaf is the layer of a vertex whose left child is a leaf is and so on. Assume that Then, the light verifiable stream of a string is its baseline verifiable stream, except omitting those labels that are contributed to the stream as left children and that are not labelling vertices whose layer is divisible by

                                        BLAKE3 and WILLIAM3 have a width of 32 bytes and a chunk_size of 1024 bytes, putting the quotient at Omitting all but one out of 32 left labels reduces total number of labels by a factor of Whereas the baseline verifiable stream has a verification metadata overhead of roughly 3.1% for Bao and WILLIAM3, the light verifiable stream reduces the overhead to roughly 1.6%, without increasing the verification delay.

                                        4.3 Omitting Lower Layers

                                        In addition to the light verifiable stream, there is an elegant way to reduce metadata transmission in exchange for increasing the verification delay: skipping all metadata for the lowest layers of the Merkle tree. These labels have to be reconstructed from successive chunks; reconstructing a label of layer requires chunks. Hence, the verification delay increases to

                                        In the upper layers of the tree, we omit left labels according to the new verification delay: the left label of vertices on layer must be present, but the labels of the next are skipped. The labels of the next layer are included, then more layers can be skipped, and so on. We call this stream the k-grouped light verifiable stream.

                                        We can also apply the optimization of omitting the lowest layers without omitting left labels higher up in the tree. We call the resulting stream the k-grouped baseline verifiable streamFor the k-grouped light verifiable stream coincides with the light verifiable stream, and the k-grouped baseline verifiable stream coincides with the baseline verifiable stream.. Since the k-grouped light verifiable stream has the same verification delay but is shorter, it should be preferred.

                                        Figure 6 shows the 1-grouped light verifiable stream for our running example.

                                        111111The length of the original string in bytes., lbl(9), lbl(3), lbl(6), he, ll, o_, wo, rl, d.

                                        A visualization of the Merkle tree for the string *hello_world*, listing in each vertex the data that that vertex contributes to the 1-grouped light verifiable stream.

                                        Figure 6: 1-Grouped Verifiable Stream Example

                                        The verification data to send for hello_world with a chunk_size of two in the 1-grouped light verifiable stream.

                                        Below, you can step through the verification process:

                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)
                                        9
                                        3
                                        6
                                        he
                                        ll
                                        o_
                                        wo
                                        rl
                                        d
                                        1
                                        2
                                        9
                                        3
                                        6
                                        4
                                        5
                                        he
                                        ll
                                        7
                                        8
                                        o_
                                        wo
                                        10
                                        11
                                        rl
                                        d
                                        lbl(2) = hash_chunk(lbl(3), lbl(6), 8, false)
                                        lbl(1) = hash_chunk(lbl(2), lbl(9), 11, true)
                                        lbl(4) = hash_chunk(he, false)
                                        lbl(5) = hash_chunk(ll, false)
                                        lbl(3) = hash_chunk(lbl(4), lbl(5), 4, false)
                                        lbl(7) = hash_chunk(o_, false)
                                        lbl(8) = hash_chunk(o_, false)
                                        lbl(6) = hash_chunk(lbl(7), lbl(8), 4, false)
                                        lbl(10) = hash_chunk(rl, false)
                                        lbl(11) = hash_chunk(d, false)
                                        lbl(9) = hash_chunk(lbl(10), lbl(11), 3, false)

                                        4.4 Slice Streaming

                                        All optimized verifiable streams can also be used for verifiable streaming of slices, by incorporating two changes. First, a label or chunk that would not be transmitted as part of the baseline verifiable slice stream is also not part of the optimized chunk stream1212This simply filters down the stream from a full-string stream to a slice stream.. And second, the labels of vertices which do not lie on a path from a chunk of the slice to the root but which would be included in the baseline verifiable chunk stream are never omitted1313If such a label was omitted, the label of the corresponding parent vertex would be impossible to reconstruct..

                                        Figure 7 and Figure 8 give examples of a left and right label respectively being included in an optimized slice stream despite being omitted when requesting the full string with the same optimizations.

                                        lbl(9), lbl(3), lbl(6), lbl(7), lbl(8), o_, wo, lbl(10), lbl(11), rl.

                                        A visualization of the Merkle tree for the string *hello_world*, listing in each vertex the data that that vertex contributes to the verified data stream for the subslice *o_worl* to the light encoding.

                                        Figure 7: Light Verifiable Slice Stream Example

                                        The verification data to send for hello_world with a chunk_size of two with the light verifiable stream format when requesting the slice o_worl. The label of vertex is not omitted, despite being omitted in non-slice requests with the light verifiable stream (compare Figure 5).

                                        lbl(9), lbl(3), lbl(6), o_, wo, lbl(11), rl.

                                        A visualization of the Merkle tree for the string *hello_world*, listing in each vertex the data that that vertex contributes to the verified data stream for the subslice *o_worl* to the 1-grouped light encoding.

                                        Figure 8: 1-Grouped Light Verifiable Slice Stream Example

                                        The verification data to send for hello_world with a chunk_size of two with the 1-grouped light verifiable stream format when requesting the slice o_worl. The label of vertex is not omitted, despite being omitted in non-slice requests with the 1-grouped light verifiable stream (compare Figure 6).

                                        In a system where clients can request slices, it stands to reason they might request several (non-overlapping) slices within the same string. Such non-overlapping slices have an overlap in their verification metadata: the streams include the labels of vertices that lie on a path from included chunks to the root, and these paths overlap towards the root. The closer two slices are, the greater their overlap. In particular, let , , and be slices such that ends before starts, and ends before starts. Then the overlap between any path from the root to a leaf in the slice and any path from the root to a leaf in or is included in the overlap beween the path from the root to the first chunk of and the path from the root to the final chunk of . Likewise, the overlap between any path from the root to a leaf in the slice and any path from the root to a leaf in or is included in the overlap beween the path from the root to the final chunk of and the path from the root to the first chunk of .

                                        For the client to tell the server which parts of the verification metadata need not be included in a slice stream because the client already has that data, it hence suffices for the client to supply two integers between and the left_skip indicates to omit from the stream the labels of the children of the first left_skip vertices on the path from the root to the first chunk of the slice, and the right_skip indicates to omit from the stream the labels of the children of the first right_skip vertices on the path from the root to the final chunk of the slice. Figure 9 gives an example.

                                        lbl(7), lbl(8), o_, wo, rl.

                                        A visualization of the Merkle tree for the string *hello_world*, listing in each vertex the data that that vertex contributes to the verified data stream for the subslice *o_worl* when left_skip and right_skip are set to two.

                                        Figure 9: Slice Skipping Example

                                        Slice streaming with the baseline verifiable stream format, a left_skip of and a right_skip of

                                        5 Conclusion

                                        Peer-to-peer content-addressable-storage systems without verifiable streaming force peers to choose between two undesirable options: either they discard unbounded amounts of data upon connection failures, making it unlikely to ever receive large strings over unreliable connections, or they persist untrusted data, turning them into a free storage backend for malicious peers. Verifiable streaming reduces the length of maximal sequences of untrusted data to a constant, configurable amount, thus making it feasible to discard unverified data when a connection fails.

                                        Bab is an efficient and flexible family of hash functions that enable streaming verification of strings. Its WILLIAM3 instantiation is close to the popular BLAKE3 hash function. Unlike BLAKE3-based systems (i.e., Bao), Bab-based systems support constant-size length proofs for strings of known digest. We have presented optimization techniques for minimizing the metadata overhead of streaming verification that are applicable both to Bab and Bao.

                                        6 Acknowledgements

                                        Most ideas in this specifications are extrapolations from the BLAKE3 paper (O’Connor et al., 2020).

                                        The optimization of omitting the lowest layers of labels from a verification stream has been known to the Bao project since at least 2020. They use a different (but equivalent) conceptual framing of explicitly grouping chunks together. I first encountered the concept through this explanation by Rüdiger Klaehn as part of the implementation of Iroh blobs.

                                        I would like to thank Jack O'Connor (who authored Bao) for helpful feedback, most of which made it directly into the current specification.

                                        References