From: thepipeline_xyz

Developing a performant EVM (Ethereum Virtual Machine) and blockchain system requires careful consideration of the underlying database, as standard database implementations often introduce significant performance bottlenecks [00:00:07]. While the computational logic inside smart contracts is relatively inexpensive to execute compared to typical desktop or phone applications [00:00:41], other aspects of blockchain operation present major performance challenges [00:00:57].

Core Performance Bottlenecks

The most expensive components in blockchain operations include [00:00:22]:

  • Cryptography Functions: This includes elliptic curve cryptography and hashing functions [00:00:27].
  • State Access: Accessing the blockchain’s state data is a significant performance drain [00:00:35].
  • Signature Recovery: Parallel signature recovery, though already implemented in some clients, remains an expensive part of transaction execution [00:01:05].

Optimization of raw computation alone offers limited gains in modern blockchains [00:00:57]. Instead, the focus shifts to areas like state access.

Database Latency and State Access

Profiling blockchain code reveals that a substantial amount of time is spent on database interactions [00:01:30]. A single read from an SSD can have a latency of 80 to 100 microseconds or more, depending on the SSD model and generation [00:01:38]. This latency is orders of magnitude longer than it takes to execute a simple smart contract [00:01:56].

Executing a single transaction often requires multiple sequential database reads [00:02:02]:

  • Reading the sender’s account to check their balance [00:02:07].
  • Reading the destination account [00:02:11].
  • Reading proxy accounts [00:02:13].
  • Reading storage slots, which is where data like ERC-20 balances or Uniswap data are stored [00:02:17].

When these sequential reads occur without being cached in main memory, the cumulative latency results in significant transaction execution times [00:02:30]. While increasing RAM to cache all data can mitigate this, it leads to very expensive hardware requirements, which is not an optimal solution for broad adoption [00:02:53].
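The arithmetic behind this problem is straightforward. The sketch below (a hypothetical back-of-the-envelope model, using the ~100 µs figure quoted above and the four read types from the list) shows how sequential, uncached reads dominate a transaction's wall-clock time:

```python
# Hypothetical latency model: sequential, uncached SSD reads for one transaction.
SSD_READ_LATENCY_US = 100  # ~80-100 µs per read, per the figures above (assumed: 100)

# Sequential reads a single transfer might require (illustrative, not exhaustive):
reads = ["sender account", "destination account", "proxy account", "storage slot"]

# Reads are dependent (each result determines the next key), so latencies add up
# instead of overlapping.
total_us = len(reads) * SSD_READ_LATENCY_US
print(f"{len(reads)} sequential reads -> ~{total_us} µs per transaction")
```

At ~400 µs per transaction, the database alone caps throughput at roughly 2,500 transactions per second per thread, regardless of how fast the contract logic itself runs.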

Limitations of Standard Databases

General-purpose databases such as Pebble or RocksDB, which are commonly used as the standard storage layer in blockchain clients, present significant performance issues [00:00:09]. These databases fall into two broad families [00:04:28]:

  • B+ tree databases: LMDB and its derivative MDBX [00:04:30].
  • LSM (Log-Structured Merge) trees: RocksDB (a derivative of LevelDB, which was the first) [00:04:37].

The fundamental problem with these options is their “general-purpose” nature [00:04:50]. They are designed for average performance across a wide range of applications, not for highly specialized and performant use cases [00:05:02].

Specific issues include:

  • Embedded Data Structures: Embedding one data structure inside another on-disk structure means each request must traverse both, which makes lookups highly expensive [00:04:06].
  • Inefficient Use of Hardware: Despite the impressive capabilities of modern SSDs (e.g., 500,000 I/O operations per second for some hosts [00:06:58]), general-purpose databases fail to leverage this raw performance [00:03:44].
  • Excessive Requests: Standard blockchain clients using these databases can make an unnecessarily high number of requests (e.g., 20 requests) just to look up basic information [00:07:41].
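The first and third issues compound each other. The toy sketch below (a hypothetical illustration, not any client's actual encoding) shows why embedding a tree-shaped structure inside a key-value store multiplies requests: every node on the path from root to leaf becomes its own KV lookup, and each of those lookups may in turn traverse the store's internal levels on disk:

```python
# Toy illustration: one logical account lookup through a tree embedded in a KV store.
# Each tree node is a separate KV entry, so one lookup becomes one request per node.

db = {}       # stands in for a general-purpose KV store (e.g. an LSM tree)
db_reads = 0  # counts requests; in a real client each is a potential SSD read

def db_get(key):
    global db_reads
    db_reads += 1
    return db[key]

# Build a toy 4-level path: root -> inner node -> inner node -> leaf (account data).
db["root"] = "n1"
db["n1"] = "n2"
db["n2"] = "leaf"
db["leaf"] = {"balance": 42}

# Walk the path to resolve a single account.
node = db_get("root")
while isinstance(node, str):
    node = db_get(node)

print(db_reads)  # -> 4 requests for one logical lookup
```

Real state trees are deeper than four levels, and the KV store adds its own read amplification underneath, which is how a single account lookup balloons into the ~20 requests mentioned above.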

This phenomenon is similar to High-Frequency Trading (HFT), where standard libraries or general data structures are avoided because customizing the data structure to the specific trading model yields significantly better performance from the hardware [00:05:10].

The Solution: Custom Database Optimization

To overcome these challenges and achieve high performance, a custom database approach is necessary [00:05:31]. By understanding exactly how the data needs to be used and stored, developers can implement a database optimized for specific blockchain requirements [00:05:36].

For instance, Monad DB was developed to extract every last bit of performance from the hardware [00:08:16]. It achieves this by drastically reducing the number of requests issued to the hardware: where a typical data structure might make 20 requests to look up an account, Monad DB can perform the same lookup with just one or two [00:07:59]. This degree of specialization is crucial for maximizing throughput and efficiency in blockchain systems [00:08:14].
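To make the contrast concrete, here is a minimal sketch (purely illustrative; it does not reflect Monad DB's actual design) of the idea behind a specialized layout: if records are keyed directly by the identifier the application actually queries, a lookup collapses to a single request instead of one request per intermediate node:

```python
# Hypothetical sketch: a flat, application-specific layout.
# Accounts are keyed directly by address, so there are no intermediate
# nodes to traverse and one logical lookup is one request.

flat_db = {}
flat_reads = 0  # counts requests to the store

def flat_get(key):
    global flat_reads
    flat_reads += 1
    return flat_db[key]

flat_db["0xabc"] = {"balance": 42}  # toy address -> account record

account = flat_get("0xabc")
print(flat_reads)  # -> 1 request for the same logical lookup
```

The trade-off is generality: such a layout only works because the access pattern (lookup by address) is known in advance, which is exactly the knowledge a general-purpose database cannot assume.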

This customized approach is a key part of the evolution towards more performant and scalable blockchain systems.