Storage
Introduction
The persistence module of Canopy defines a Store interface for interacting with blockchain storage.
The Store struct is a high-level abstraction layer built on top of a single Key-Value database, providing 3 main components for managing blockchain-related data:
State Store: Maintains all versions of the state (ledger) for every height of the blockchain.
CanoTrie: Enables efficient verification of the entire state contents for any given version (height).
Indexer: Maintains non-state data like Transactions, Blocks, and Quorum Certificates.
Under the hood, the store uses BadgerDB in Managed Mode to keep historical versions of the state, enabling "time-travel" operations and historical queries.
All writes to the State Store, CanoTrie, and the Indexer happen inside a single BadgerDB batch, making sure they're committed atomically: one safe operation per block height.
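As a rough illustration of managed mode, the sketch below writes a key at one height and then reads it back as of that height. The import path, database path, and keys are assumptions for illustration, not the actual Canopy wiring:

package main

import (
    badger "github.com/dgraph-io/badger/v4"
)

func main() {
    // open BadgerDB in managed mode so version timestamps (heights) are controlled by the caller
    db, err := badger.OpenManaged(badger.DefaultOptions("/tmp/canopy-example"))
    if err != nil {
        panic(err)
    }
    defer db.Close()
    // write a key at "height" 10 and commit it atomically at that timestamp
    txn := db.NewTransactionAt(10, true)
    defer txn.Discard()
    if err = txn.Set([]byte("account/alice"), []byte("balance=100")); err != nil {
        panic(err)
    }
    if err = txn.CommitAt(10, nil); err != nil {
        panic(err)
    }
    // "time-travel": read the value exactly as it existed at height 10
    read := db.NewTransactionAt(10, false)
    defer read.Discard()
    item, err := read.Get([]byte("account/alice"))
    if err != nil {
        panic(err)
    }
    val, _ := item.ValueCopy(nil)
    _ = val // the value as of height 10
}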
State Store
The State Store is where the actual data blobs, like Accounts and Validators, are stored to represent the blockchain's state.
It serves as the primary storage layer, organized into two parts: 'historical' partitions and 'latest' data.
This split makes it easy to quickly access snapshots, efficiently iterate over data, and safely prune older state without slowing down the system.
type StateStore interface {
// set value referenced by key
Set(key, value []byte) error
// delete key value
Delete(key []byte) error
// access value bytes using key bytes
Get(key []byte) ([]byte, error)
// iterate through the data one KV pair at a time in lexicographical order
Iterator(prefix []byte) (Iterator, error)
// iterate through the data one KV pair at a time in reverse lexicographical order
RevIterator(prefix []byte) (Iterator, error)
}
Schema
Because it is built on a Key Value implementation, the database does not support schemas like a relational database engine such as SQL.
However, logical groupings and efficient iteration may be enabled by utilizing prefix keys.
var (
accountPrefix = []byte{1} // prefix for accounts
poolPrefix = []byte{2} // prefix for pools
validatorPrefix = []byte{3} // prefix for validators
committeePrefix = []byte{4} // prefix for validators in committees
)
By prepending succinct byte labels, a Key Value implementation is transformed into an organized architecture:
func KeyForAccount(address []byte) []byte {
    // concatenate the account prefix and the address bytes
    return append(accountPrefix, address...)
}
And by leveraging the lexicographical ordering of the database implementation, numeric values can be ordered deterministically by encoding them with big-endian byte order.
func KeyForCommittee(chainId uint64, addr []byte, stake uint64) []byte {
    // committee prefix + big-endian stake + address
    return append(append(CommitteePrefix(chainId), formatUint64(stake)...), addr...)
}
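For illustration, a formatUint64 helper could be implemented with an 8-byte big-endian encoding, so that byte-wise key ordering matches numeric ordering (a sketch; the actual Canopy helper may differ):

import "encoding/binary"

// formatUint64 encodes a number as 8 big-endian bytes so that
// lexicographical ordering of keys matches numeric ordering
func formatUint64(n uint64) []byte {
    b := make([]byte, 8)
    binary.BigEndian.PutUint64(b, n)
    return b
}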
Thus, when retrieving a group, the keys are already ordered logically.
func (s *StateMachine) GetCommitteeMembers(chainId uint64) (members Members) {
    // get committee members from the highest stake to lowest
    it := s.RevIterator(CommitteePrefix(chainId))
    // for each item of the iterator up to MaxCommitteeSize
    for i := 0; it.Valid() && i < MaxCommitteeSize; it.Next() {
        member := addressFromKey(it.Key())
        members = addMember(members, member)
        i++
    }
    return
}
CanoTrie
CanoTrie is an optimized Sparse Merkle Trie (SMT) designed for key-value storage. It combines properties of prefix trees and Merkle trees to efficiently handle sparse datasets and cryptographic integrity.
At a high level, CanoTrie is optimized for blockchain operations, allowing efficient proofs of existence within the tree and enabling the creation of a single 'root' hash. This root hash represents the entire state, facilitating easy verification by other nodes to ensure consistency between peers' states.
CanoTrie design:
Sparse Structure: Keys are organized by their binary representation, with internal nodes storing common prefixes to reduce redundant paths
Merkle Hashing: Each node stores a hash derived from its children, enabling cryptographic proofs for efficient verification of data integrity
Optimized Traversals: Operations like insertion, deletion, and lookup focus only on the relevant parts of the tree, minimizing unnecessary traversal of empty nodes
Key-Value Operations: Supports upserts and deletions by dynamically creating or removing nodes while maintaining the Merkle tree structure
Optimizations over standard SMT:
Any leaf nodes without values are set to nil. A parent node is also nil if both children are nil
If a parent has exactly one non-nil child, replace the parent with the non-nil child
A tree always starts with two children, (0x0...) and (0xF...), and a Root
Basic Algorithm:
Tree Traversal
Navigate down the tree to set current to the closest node based on the binary of the target key
Operation
Upsert (Insert or Update)
If the target already matches current: Update the existing node
Otherwise: Create a new node to represent the parent of the target node and current
Replace the pointer to current within its old parent with the new parent
Assign the current node and the target as children of the new parent
Delete
If the target matches current:
Delete current
Replace the pointer to current's parent within current's grandparent with current's sibling
Delete current's parent node
ReHash
Update hash values for all ancestor nodes of the modified node, moving upward until the root
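To make the re-hash step concrete, here is a minimal node sketch (the field names and hashing scheme are assumptions for illustration; the real CanoTrie node layout differs):

import "crypto/sha256"

// node is a simplified SMT node used only for illustration
type node struct {
    value       []byte // nil for internal nodes
    left, right *node  // nil children represent empty subtrees
    parent      *node
    hash        []byte
}

// reHash walks from the modified node up to the root,
// recomputing each hash from its children and value
func reHash(n *node) {
    for ; n != nil; n = n.parent {
        h := sha256.New()
        if n.left != nil {
            h.Write(n.left.hash)
        }
        if n.right != nil {
            h.Write(n.right.hash)
        }
        h.Write(n.value)
        n.hash = h.Sum(nil)
    }
}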
Example 1: Inserting 1101
Before Insert (Key = 1101):
root
/ \
0000 1
/ \
1000 111
/ \
1110 1111
After Insert (Inserted 1101 and 11):
root
/ \
0000 1
/ \
1000 *11*
/ \
*1101* 111
/ \
1110 1111
Example 2: Deleting 010
Before Delete (Key = 010):
root
/ \
*0* 1
/ \ / \
000 *010* 101 111
After Delete (Deleted 010 and compressed the tree):
root
/ \
000 1
/ \
101 111
Operations
When writing state data in the Store, 3 write operations are executed:
Latest State Store:
{ Key , Value }
Historical State Store:
{ Key , Value }
CanoTrie:
{ Hash(Key) , Hash(Value) }
func (s *Store) Set(k, v []byte) error {
// set in the state store @ latest
if err := s.lss.Set(k, v); err != nil {
return err
}
// set in the state store @ historical partition
if err := s.hss.Set(k, v); err != nil {
return err
}
// set in canotrie
return s.sc.Set(k, v)
}
Though there's an I/O tradeoff for writes, this implementation enables efficiency in the following operations:
Partition
Every few days, the Store executes partitioning logic:
Snapshot: At each partition boundary, copy the latest version of every key from the State Store into a new Historical Partition. This ensures that each Historical Partition contains a complete, independent snapshot of state at that height, allowing older partitions to be safely pruned without dependency risk.
Prune: Remove all versions in the Latest State Store that are older than the partition boundary, retaining only the newest versions necessary for live state execution.
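A rough sketch of that two-step flow; partitionFrequency, NewHistoricalPartition, PruneVersionsBefore, and it.Value() are illustrative names, not the actual Canopy API:

// partitionFrequency is an assumed boundary interval, measured in blocks
const partitionFrequency = 100000

func (s *Store) maybePartition(height uint64) error {
    if height%partitionFrequency != 0 {
        return nil // not at a partition boundary
    }
    // Snapshot: copy the latest version of every key into a new historical partition
    part := s.NewHistoricalPartition(height) // hypothetical helper
    it, err := s.lss.Iterator(nil)
    if err != nil {
        return err
    }
    for ; it.Valid(); it.Next() {
        if err = part.Set(it.Key(), it.Value()); err != nil {
            return err
        }
    }
    // Prune: drop versions in the Latest State Store older than the boundary
    return s.lss.PruneVersionsBefore(height) // hypothetical helper
}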
Benefits of this design:
Fast State Iteration: The State Store remains lean with minimal version history per key, dramatically improving iteration speed and reducing read amplification.
Efficient Historical Access: Historical data is chunked into partitions, enabling faster point-in-time lookups and smaller, more targeted iterator ranges.
No Reverse Seeks: Thanks to managed timestamps and partitioned snapshots, queries avoid expensive reverse scans.
Organized Historical Hierarchies: Historical partitions are logically structured, making archival, backup, and pruning operations predictable and scalable.
Indexer
This component indexes important non-ledger data such as Quorum Certificates, Blocks, and Transactions.
The indexer is designed to allow efficient querying and retrieval, which is crucial for blockchain operations: large value blobs (like Blocks or Transactions) are stored only once, while lightweight indexes are maintained separately.
This approach enables fast and flexible logical queries, such as:
Transaction by Hash
Transactions by Height
Transactions by Sender (address)
Transactions by Recipient (address)
Block by Hash
Block by Height
Quorum Certificate by Height
Double Signers by Height
Checkpoint by ChainID + Height
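As an illustration of the "store the blob once, index it lightly" approach, a transaction might be indexed roughly as follows (the prefixes and helper names are hypothetical, not Canopy's actual schema):

var (
    txByHashPrefix   = []byte{10} // assumed prefix: transaction blob keyed by hash
    txByHeightPrefix = []byte{11} // assumed prefix: hash reference keyed by height
)

// IndexTransaction stores the transaction bytes once under their hash and adds
// a lightweight height-index entry that points back to that hash
func (s *Store) IndexTransaction(height uint64, hash, txBytes []byte) error {
    // the large blob is written a single time
    if err := s.Set(append(txByHashPrefix, hash...), txBytes); err != nil {
        return err
    }
    // the height index stores only the hash, keeping the entry small
    key := append(append(txByHeightPrefix, formatUint64(height)...), hash...)
    return s.Set(key, hash)
}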
Mempool
The Mempool is a pre-block, in-memory store component that holds valid transactions pending inclusion in a block.
Canopy's default Mempool implementation ensures that only valid transactions are kept in the pool by maintaining an ephemeral State Machine that applies the transactions as if the Mempool contents were the next block included in the blockchain.
Pending transactions are validated at the Mempool level based on the current state and any transactions already ahead of them in the queue.
Importantly, the Mempool is ordered highest fee to lowest, with the exception of the zero-fee Certificate Result Transaction, which requires a +2/3 signature from an auto-subsidized Committee.
Mempool configurations allow users to limit their mempool by number of bytes and number of transactions; if either of these limits is exceeded, a certain percentage of the lowest-fee transactions is dropped by the pool.
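Putting those rules together, admission into the pool might look roughly like this (the field and helper names, such as applyToEphemeralState and dropLowestFee, are assumptions for illustration, not the actual Canopy implementation):

// MempoolConfig mirrors the limits described above (field names assumed)
type MempoolConfig struct {
    MaxTotalBytes   uint64 // cap on total transaction bytes in the pool
    MaxTransactions int    // cap on the number of pending transactions
    DropPercentage  int    // percent of lowest-fee transactions evicted when a cap is hit
}

func (m *Mempool) HandleTransaction(tx []byte) error {
    // validate against the ephemeral state machine, which has already applied
    // every transaction ahead of this one in the queue
    fee, err := m.applyToEphemeralState(tx)
    if err != nil {
        return err // invalid given the current state plus pending transactions
    }
    // keep the pool ordered from highest fee to lowest
    m.insertByFee(tx, fee)
    // enforce the configured limits by evicting the lowest-fee transactions
    for m.totalBytes > m.config.MaxTotalBytes || m.count > m.config.MaxTransactions {
        m.dropLowestFee(m.config.DropPercentage)
    }
    return nil
}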