Importance of Custom State Databases in Blockchain

From: thepipeline_xyz

Developing a custom state database is crucial for achieving high performance in EVM (Ethereum Virtual Machine) implementations and addressing the specific challenges in blockchain system design [00:00:15] [00:00:17]. Standard databases like Pebble DB or RocksDB are often insufficient for the unique demands of blockchain environments [00:00:09].

Key Performance Bottlenecks in Blockchain

The most expensive operations in blockchain are:

Cryptographic functions such as elliptic curve cryptography and hashing [00:00:24] [00:00:27].
State access [00:00:35].

In contrast, the business logic in most smart contracts is relatively “cheap” to execute compared to typical desktop or phone applications, so speeding up raw computation offers little gain [00:00:41] [00:00:55] [00:01:00]. Some clients already parallelize signature recovery, which is an expensive part of transaction execution, so there isn’t much more to gain in that area either [00:01:05] [00:01:09] [00:01:17].

Instead, the primary bottleneck often lies in database interactions. A single read from an SSD can have a latency of 80 to 100 microseconds or more [00:01:38] [00:01:40] [00:01:45]. This is orders of magnitude longer than it takes to execute a simple smart contract [00:01:56] [00:01:59].

Inefficiencies of General-Purpose Databases

Blockchain clients often make multiple sequential database reads for a single transaction, such as:

Reading the sender’s account to check balance [00:02:07] [00:02:08].
Reading the destination account [00:02:11].
Reading proxy accounts [00:02:13] [00:02:15].
Reading storage slots (e.g., ERC20 token balances or Uniswap data) [00:02:17] [00:02:19] [00:02:22] [00:02:23] [00:02:25].

If these reads are not cached in main memory and must hit the disk, the cumulative latency can significantly prolong transaction execution time [00:02:30] [00:02:33] [00:02:36] [00:02:39] [00:02:41] [00:02:43] [00:02:45].
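
As a rough illustration of how these reads add up, the sketch below multiplies an assumed number of uncached, sequential reads by the SSD latency quoted above. The per-category read counts (for example, two storage slots) are hypothetical, and the sketch assumes each logical read costs exactly one disk access, which understates the cost for databases that need several accesses per lookup.

```python
# Back-of-the-envelope estimate of disk latency for one transaction.
# All read counts below are illustrative assumptions, not measurements.
SSD_READ_LATENCY_US = 100  # upper end of the 80-100 microsecond range


def sequential_read_latency_us(uncached_reads, latency_us=SSD_READ_LATENCY_US):
    """Total latency when each read must finish before the next one starts."""
    return uncached_reads * latency_us


# Hypothetical token transfer that goes through a proxy contract.
reads = {
    "sender account": 1,
    "destination account": 1,
    "proxy account": 1,
    "storage slots": 2,
}

total_reads = sum(reads.values())
total_us = sequential_read_latency_us(total_reads)
print(f"{total_reads} uncached reads -> {total_us} microseconds")
```

Under these assumptions, five dependent reads already cost about half a millisecond of waiting on the disk for a single transaction.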

While one solution is to simply “throw RAM at it” by requiring expensive, large-memory servers to minimize disk reads [00:02:53] [00:02:54] [00:02:56], this approach does not address the fundamental inefficiencies.

Standard databases commonly used in blockchain clients include:

B+ tree databases: LMDB and MDBX (an LMDB derivative) [00:04:30] [00:04:33] [00:04:37].
LSM (Log-Structured Merge) trees: RocksDB and LevelDB [00:04:37] [00:04:40] [00:04:43].

These are general-purpose databases designed for good average performance across many workloads, not for blockchain state access specifically [00:04:50] [00:04:51] [00:04:54] [00:05:05]. A key issue in some clients, such as Geth, is that one data structure (the state trie) is embedded inside another (the database's own on-disk structure), so every request pays for an expensive double traversal [00:04:06] [00:04:09] [00:04:10] [00:04:13] [00:04:14] [00:04:18] [00:04:20] [00:04:22].
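
The double traversal can be sketched with a toy model: a trie whose nodes are stored as individual entries in a key-value store, where each `get` itself costs several disk accesses. The `CountingStore` class, the one-node-per-prefix layout, and the figure of three I/Os per `get` are all assumptions made for illustration; this is not Geth's actual on-disk format.

```python
class CountingStore:
    """Toy key-value store that charges a fixed number of I/Os per get."""

    def __init__(self, ios_per_get):
        self.ios_per_get = ios_per_get  # assumed cost of the store's own index walk
        self.data = {}
        self.ios = 0

    def get(self, key):
        self.ios += self.ios_per_get
        return self.data[key]


def insert(store, key, value):
    """Store one trie node per key prefix (keys are assumed to be fixed length)."""
    for i in range(len(key)):
        node = store.data.setdefault(key[:i], {})
        node[key[i]] = key[: i + 1]  # child pointer is the longer prefix
    store.data[key] = {"value": value}


def lookup(store, key):
    """Walk the trie from the root; every node fetch is a separate store get."""
    node = store.get("")
    for nibble in key:
        node = store.get(node[nibble])
    return node["value"]


store = CountingStore(ios_per_get=3)
insert(store, "a1b2", "balance=10")
insert(store, "a1c3", "balance=7")

result = lookup(store, "a1b2")
print(result, "cost:", store.ios, "I/Os")
```

With these assumed numbers, a four-nibble key needs 5 node fetches and 15 I/Os: the trie depth is multiplied by the store's per-lookup cost.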

The Advantage of Customization

Modern SSDs are incredibly powerful, capable of 500,000 I/O operations per second [00:06:53] [00:06:54] [00:06:56] [00:06:58] [00:07:01]. However, this raw performance is wasted if the software isn’t optimized to leverage it [00:07:04] [00:07:07] [00:07:10]. Even an algorithm with better computational complexity will perform poorly if implemented inefficiently [00:07:12] [00:07:13] [00:07:16] [00:07:19].
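
One way to see how software can leave an SSD idle is to compare the drive's rated throughput with what a client achieves when it waits for each read to finish before issuing the next. The sketch below uses the 500,000 IOPS and 100 microsecond figures from this article and assumes, for simplicity, that latency stays constant no matter how many reads are in flight.

```python
SSD_IOPS = 500_000       # rated I/O operations per second
READ_LATENCY_US = 100    # assumed latency of one read, in microseconds


def reads_per_second(in_flight, latency_us=READ_LATENCY_US):
    """Throughput when `in_flight` reads are always outstanding (Little's law)."""
    return in_flight * 1_000_000 / latency_us


serial = reads_per_second(1)                 # one read at a time
utilization = serial / SSD_IOPS              # fraction of the drive actually used
needed_in_flight = SSD_IOPS * READ_LATENCY_US / 1_000_000

print(f"serial reads/s: {serial:.0f} ({utilization:.0%} of rated IOPS)")
print(f"reads in flight needed to saturate the drive: {needed_in_flight:.0f}")
```

Under these assumptions, a strictly serial reader uses only a small fraction of the drive, and dozens of reads would need to be outstanding at once to approach the rated figure.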

This concept is well-understood in high-frequency trading (HFT), where standard libraries are avoided because custom data structures, tailored to the specific trading model, yield significantly better performance from the hardware [00:05:10] [00:05:12] [00:05:14] [00:05:17] [00:05:19] [00:05:20] [00:05:23] [00:05:26] [00:05:29].

Applying this principle to blockchain, a custom database like MonadDb is designed with precise knowledge of how data will be used and stored [00:05:32] [00:05:36] [00:05:40] [00:05:42]. This allows for “super optimization” to extract every last bit of performance [00:08:14] [00:08:16] [00:08:18].

For example, MonadDb might require only one or two requests to look up an account, whereas general-purpose data structures could require 20 requests to the hardware if the data is not cached [00:07:42] [00:07:44] [00:07:47] [00:07:50] [00:07:59] [00:08:02] [00:08:06] [00:08:07] [00:08:09] [00:08:11]. This reduction in disk I/O requests leads to large performance gains and is why custom state database development is critical for performant blockchain ecosystems [00:06:31] [00:06:34] [00:06:36].
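
To put the request counts in perspective, the sketch below treats the drive's rated IOPS as a fixed budget and asks how many uncached account lookups per second fit into it. It assumes the 500,000 IOPS figure quoted earlier, that every lookup misses the cache, and that the drive is the only limit; real systems have other constraints.

```python
SSD_IOPS = 500_000  # rated I/O operations per second, from the figure above


def max_lookups_per_second(requests_per_lookup, iops=SSD_IOPS):
    """Upper bound on uncached lookups if each one consumes this many I/Os."""
    return iops // requests_per_lookup


general_purpose = max_lookups_per_second(20)  # up to 20 requests per lookup
custom = max_lookups_per_second(2)            # one or two requests per lookup

print(f"general-purpose: {general_purpose} lookups/s")
print(f"custom:          {custom} lookups/s")
print(f"ratio:           {custom / general_purpose:.0f}x")
```

Under these assumptions, cutting the requests per lookup from 20 to 2 raises the ceiling on uncached lookups by a factor of ten on the same hardware.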
