Data proof

Maru data proof via zk-STARKs enables efficient verification of any receipts, leaves, nodes, roots, or super root included in the Patricia Merkle Trie, and connects with other STARKs during proving using CTLs (cross-table lookups).

Circuit

Private inputs include data fields for each value:

  • input contains the bytes of hashes or other data from the MPT

  • offset contains the index of the required element in input

A data operation has a structure consisting of the enum DataType, which indicates the type of data in the operation, and DataItem, which contains additional data such as the offset in the block and item, a byte array holding a hash or volume value:

#[derive(Clone, Debug, Copy)]
pub(crate) struct DataItem {
    pub(crate) offset: usize,
    pub(crate) offset_in_block: usize,
    pub(crate) item: [u8; KECCAK_DIGEST_BYTES],
}

#[derive(Debug, Copy, Clone, Eq, PartialEq)]
pub enum DataType {
    Leaf = 0,
    Node = 1,
    ReceiptsRoot = 3,
    BlockHash = 4,
    TotalSum = 5,
}

#[derive(Clone, Debug)]
pub(crate) struct DataOp {
    pub(crate) input: Vec<u8>,
    pub(crate) contract: Option<DataItem>,
    pub(crate) value: Option<DataItem>,
    pub(crate) child: Option<Vec<DataItem>>,
    pub(crate) external_child: Option<Vec<DataItem>>,
    pub(crate) data_type: DataType,
    pub(crate) method_signature: Option<DataItem>,
    pub(crate) receipts_root: Option<DataItem>,
    pub(crate) token_id: Option<DataItem>,
    pub(crate) pi_sum: Option<DataItem>,
}

If we insert a Leaf data type, which doesn't have a child, then contract, value, and method_signature must be included. When we insert a Node or Root, we have only input, DataType, and child. To handle multiple transactions in one block, we partially "reconstruct" the PMT to get one receipts root. Therefore, a node may have multiple common children.
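As an illustration, here is a minimal sketch of how a leaf operation and a node operation could be populated, using the DataOp, DataItem, and DataType types above. The helper functions are hypothetical and only show which fields each data type is expected to carry; KECCAK_DIGEST_BYTES is assumed to be 32 (the Keccak-256 digest size).

// Hypothetical constructors, for illustration only.
const KECCAK_DIGEST_BYTES: usize = 32; // assumed Keccak-256 digest size

fn leaf_op(
    input: Vec<u8>,
    contract: DataItem,
    value: DataItem,
    method_signature: DataItem,
) -> DataOp {
    // A leaf has no children but must carry contract, value and method_signature.
    DataOp {
        input,
        contract: Some(contract),
        value: Some(value),
        child: None,
        external_child: None,
        data_type: DataType::Leaf,
        method_signature: Some(method_signature),
        receipts_root: None,
        token_id: None,
        pi_sum: None,
    }
}

fn node_op(input: Vec<u8>, children: Vec<DataItem>) -> DataOp {
    // A node (or root) carries only its input bytes and its child hashes,
    // which is enough to partially reconstruct the PMT up to one receipts root.
    DataOp {
        input,
        contract: None,
        value: None,
        child: Some(children),
        external_child: None,
        data_type: DataType::Node,
        method_signature: None,
        receipts_root: None,
        token_id: None,
        pi_sum: None,
    }
}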

For block linking we input only the parent block hash as starting_blockhash and the last block hash as ending_blockhash. The execution trace has the following column structure (a simplified sketch of how a few of these columns evolve follows the list):

  • already_absorbed_bytes: the number of input bytes that have already been absorbed prior to this block.

  • is_full_input_block will be 1 if this row represents a full input block, i.e. one in which each byte is an input byte, 0 otherwise.

  • len: value of the original input length (in bytes).

  • last_block is equal to 1 if this is the last block of the data operation.

  • is_leaf will be 1 if this operation has the leaf data type.

  • is_node will be 1 if this operation has the node data type.

  • is_receipts_root will be 1 if this operation has the root data type.

  • is_block_hash will be 1 if this operation has the block hash data type (we use this type to prove that the last block hash in linking is correct).

  • is_shadow has a value of 1 if a single block contains the contract address, method signature, and transfer value.

  • prefix_bytes contain the last 32 bytes of the previous block, or zero bytes if this is the first row of the data operation.

  • block_bytes contain the block being absorbed, which may contain input bytes and/or padding bytes.

  • contract_address_found will be 1 when the contract address is present in the block.

  • method_signature_found will be 1 when the method signature is present in the block.

  • transfer_value_found will be 1 when the volume value is present in the block.

  • child_hash_found will be 1 when a child hash is present in the block.

  • receipts_root_found will be 1 when the root hash is present in the block.

  • external_child_hash_found will be 1 when the first parent hash of the block is present in the block.

  • sold_token_id_found will be 1 when the token id value is present in the block.

  • calculation_id: the ID of the volume summation operation.

  • range_counter is used to store the value of the range-check counter in the table.

  • rc_cols are used to range-check every byte of block_bytes in the trace, namely the permutation of the column and the permutation of the range.

  • id: the ID of the search operation. It is identical in the Data and Search tables.

  • offset_block: the index indicating the start of the sub-array with the required data within the input block.

  • shift_num: the number of cyclic shifts required for the search proof (zero_shift_num holds the corresponding zero shift count).

  • zero_shift_num always holds the value zero.

  • offset_object: the index indicating the start of the sub-array in input.

  • typed_data: a 32-byte array containing a hash, value, contract address, or method signature.
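To make the sponge-related columns concrete, here is a minimal sketch of how already_absorbed_bytes, is_full_input_block, and last_block could evolve over the rows of a single data operation. It is an illustration only (not the actual trace generator) and assumes the standard 136-byte rate of the Keccak-256 sponge.

// Illustration only: a simplified row holding just a few of the columns listed above.
#[derive(Debug, Default, Clone)]
struct SimpleRow {
    already_absorbed_bytes: usize,
    is_full_input_block: bool,
    last_block: bool,
    len: usize,
}

const KECCAK_RATE_BYTES: usize = 136; // rate of the Keccak-256 sponge

// Produce one row per absorbed block of one data operation's input.
fn rows_for_op(input_len: usize) -> Vec<SimpleRow> {
    let mut rows = Vec::new();
    let mut absorbed = 0;
    loop {
        let remaining = input_len - absorbed;
        // A full input block consists entirely of input bytes; the final block
        // also contains padding bytes and is therefore not a full input block.
        let full = remaining >= KECCAK_RATE_BYTES;
        rows.push(SimpleRow {
            already_absorbed_bytes: absorbed,
            is_full_input_block: full,
            last_block: !full,
            len: input_len,
        });
        if !full {
            break;
        }
        absorbed += KECCAK_RATE_BYTES;
    }
    rows
}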

Constraints

The Data Stark aims to constrain the following aspects (a sketch of a few of these constraints follows the list):

  • Each flag (full-input block, final block or implied dummy flag) must be boolean.

  • Ensure that zero_shift_num is always zero.

  • Ensure that full-input block and final block flags are not set to 1 at the same time.

  • If this is a final block, the next row’s original sponge state should be 0 and already_absorbed_bytes = 0.

  • If this is a full-input block, the next row’s already_absorbed_bytes should be ours plus 136.

  • Verify that the offset of the previous row is greater than the one corresponding to the next row.

  • Verify that if this is a shadow row, the current and next rows have equal already_absorbed_bytes.

  • A dummy row is always followed by another dummy row, so the prover can't put dummy rows "in between" to avoid the above checks.

  • Ensure that if we have more than 2 objects of length 32 in one block, is_shadow will be 1 and the next row has the same prefix and block_bytes as the previous row.

  • Ensure that if method_signature_found = 1, then typed_data must be equal to the hard-coded sell signature of the Curve 3pool trade (temporarily).

  • Ensure that if contract_address_found = 1, then typed_data must be equal to the hard-coded contract address of the Curve 3pool trade (temporarily).

  • Ensure that the distance between the end of the contract address and the start of the method signature is equal to 3 (a feature of element locations in Curve 3pool trade receipts).

  • Ensure that the distance between the end of the method signature and the start of the sold_token_id value is equal to 35 (a feature of element locations in Curve 3pool trade receipts).

  • Ensure that the distance between the end of sold_token_id and the start of the volume value is equal to 0 (a feature of element locations in Curve 3pool trade receipts).
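The following sketch shows a few of these constraints written as plain expressions that must all evaluate to zero. It is an illustration only, using i128 arithmetic in place of field elements; the actual circuit expresses the same algebra through the STARK constraint system, and the names mirror the columns above.

// Illustration only: constraint polynomials over local and next rows.
struct Row {
    is_full_input_block: i128,
    last_block: i128,
    is_shadow: i128,
    zero_shift_num: i128,
    already_absorbed_bytes: i128,
}

const KECCAK_RATE_BYTES: i128 = 136;

fn constraints(local: &Row, next: &Row) -> Vec<i128> {
    vec![
        // Flags are boolean: x * (x - 1) == 0.
        local.is_full_input_block * (local.is_full_input_block - 1),
        local.last_block * (local.last_block - 1),
        local.is_shadow * (local.is_shadow - 1),
        // zero_shift_num is always zero.
        local.zero_shift_num,
        // Full-input block and final block flags cannot both be set.
        local.is_full_input_block * local.last_block,
        // After a final block, the next operation starts absorbing from zero.
        local.last_block * next.already_absorbed_bytes,
        // After a full-input block, absorption advances by the 136-byte Keccak rate.
        local.is_full_input_block
            * (next.already_absorbed_bytes - local.already_absorbed_bytes - KECCAK_RATE_BYTES),
        // A shadow row reuses the same block, so absorption does not advance.
        local.is_shadow * (next.already_absorbed_bytes - local.already_absorbed_bytes),
    ]
}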

The main idea of the proof relies on CTLs (cross-table lookups), which connect column data from other STARKs (a conceptual sketch follows the list):

  • ctl_data(): block bytes, filtered on rows with leaf, node, or root data, are equal to the concatenated columns xored_rate_u32s, original_capacity_u32s, and updated_state_u32s for the input block.

  • ctl_keccak_sponge(): the typed data of a child hash has to be equal to updated_state_bytes when it is the final input block.

  • ctl_substr_in_haystack(): the lookup guarantees that the concatenated columns of typed_data with zero_shift_num and id in the Data table, using a row filter where child_hash_found, transfer_value_found, method_signature_found, contract_address_found, receipts_root_found, external_child_hash_found, sold_token_id_found, or total_sum = 1, are equal to the concatenated columns of NEEDLE_REGISTER with OFFSET and ID in the Search table, using a filter where IS_OUT = 1.

  • ctl_haystack_equal_data(): the lookup guarantees that the concatenated columns of prefix, block_bytes, id, and shift_num in the Data table, using a filter where child_hash_found, transfer_value_found, method_signature_found, contract_address_found, receipts_root_found, external_child_hash_found, sold_token_id_found, or total_sum = 1, have the exact values of HAYSTACK_REGISTER in the Search table.

  • ctl_last_blockhash_included(): the lookup guarantees that the first parent block hash and the last block hash in linking are equal to the block hashes and total volume from the public inputs.

  • ctl_volume_value(): ensures that the typed_data and CALCULATION_ID columns contain the exact values of VALUE and CALCULATION_ID in the Sum Stark.

  • ctl_token_id(): ensures that the typed_data and CALCULATION_ID columns contain the exact values of TOKEN_ID and CALCULATION_ID in the Sum Stark.
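Conceptually, each of these CTLs asserts that a filtered multiset of column tuples in the Data table equals a filtered multiset of column tuples in the other table. The sketch below shows that view as an explicit check; it is an illustration only, since the real proof enforces this through the proving system's cross-table lookup argument rather than by comparing tables directly.

use std::collections::HashMap;

// Illustration only: a cross-table lookup viewed as filtered multiset equality
// between selected columns of two trace tables.
type Trace = Vec<Vec<u64>>; // trace[row][column]

struct CtlSide<'a> {
    trace: &'a Trace,
    columns: Vec<usize>,  // columns participating in the lookup
    filter_column: usize, // a row is included when this column equals 1
}

fn filtered_multiset(side: &CtlSide) -> HashMap<Vec<u64>, usize> {
    let mut counts = HashMap::new();
    for row in side.trace {
        if row[side.filter_column] == 1 {
            let tuple: Vec<u64> = side.columns.iter().map(|&c| row[c]).collect();
            *counts.entry(tuple).or_insert(0) += 1;
        }
    }
    counts
}

// The lookup holds when both sides expose exactly the same multiset of tuples,
// e.g. typed_data/zero_shift_num/id on the Data side against
// NEEDLE_REGISTER/OFFSET/ID on the Search side.
fn ctl_holds(data_side: &CtlSide, search_side: &CtlSide) -> bool {
    filtered_multiset(data_side) == filtered_multiset(search_side)
}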
