From: thepipeline_xyz

Monad DB is a custom state database designed from scratch to enable high-performance blockchain operations, specifically addressing the unique challenges of storing Ethereum state [04:51]. It is a fundamental component that allows Monad’s parallel execution to function effectively [14:29].

What is Monad DB?

Monad DB is a fully custom state database, a key part of Monad’s new software architecture [04:51]. Its primary purpose is to natively handle the Ethereum state, which is structured as a Merkle Patricia Trie (MPT), to ensure efficient data access and prevent bottlenecks in high-performance systems [14:48].

Overcoming State Access Bottlenecks

In blockchain systems, one of the four fundamental constraints is state access, which refers to retrieving data from the database storing the chain’s historical state [02:33]. For parallel execution to be effective, the database must be able to feed the execution engine continuously without delays [14:58]. If the system has to wait for data to be pulled from the database for each transaction, parallel execution becomes ineffective [15:12].

Monad DB addresses this by enabling parallel state access through asynchronous I/O [15:04]. This means it can issue many requests to the database simultaneously, and as those requests return, the execution engine can immediately process them [15:25].

Leveraging Modern Hardware

Modern hard drives, particularly SSDs, offer high bandwidth and millions of I/O operations per second, though individual lookups can take milliseconds [15:42]. Monad DB is designed to saturate this memory bandwidth by constantly pulling data out of state in parallel [16:09]. This ensures that other parts of the system, like the execution engine, are not bottlenecked waiting for data [16:15].

The Merkle Patricia Trie Challenge

Ethereum stores its state in a Merkel Patricia Trie (MPT), a tree structure that allows for succinct verification of transactions and state [16:32]. This means you can check the root hash to verify all data under that root [17:26]. This is a core philosophical reason for Ethereum’s design, enabling efficient verification without downloading and re-executing entire blocks [17:46].

However, traditional blockchain clients like Geth and Aragon typically use off-the-shelf databases (e.g., B-trees, LSM trees) to store this MPT [16:45]. These general-purpose databases are not natively structured as an MPT [16:56]. This creates an inefficiency: when traversing the logical MPT, the system has to perform multiple lookups within the underlying, differently structured database [19:10]. This “tree traversal within a tree traversal” can result in 16 to 32 times more overhead per lookup [19:20].

Monad DB’s Solution

Monad DB is a complete rebuild of a database where the Merkle Patricia Trie is the actual way data is stored on disk [19:42]. This eliminates the layer of indirection and the need for multiple lookups [19:50]. By natively structuring the MPT on disk, Monad DB significantly reduces the number of lookups required per traversal [20:10].

Combined with asynchronous I/O using kernel technology called IOU ring, Monad DB can spawn many parallel lookup threads [20:57]. These parallel lookups feed data into the parallel execution engine, ensuring that compute resources are constantly saturated [21:08].

Impact on Performance

This unique database optimization is the “secret sauce” that makes parallel execution truly work [21:19]. By providing parallel state access, Monad DB allows the system to achieve its high throughput goals, enabling higher transaction processing and lower fees by fully utilizing hardware capabilities [21:12].