From: thepipeline_xyz

Developing a performant EVM (Ethereum Virtual Machine) and blockchain system requires careful consideration of the underlying database, as standard database implementations often introduce significant performance bottlenecks [00:00:07]. While the computational logic inside smart contracts is relatively inexpensive to execute compared to typical desktop or phone applications [00:00:41], other aspects of blockchain operation present major performance challenges [00:00:57].

Core Performance Bottlenecks

The most expensive components in blockchain operations include [00:00:22]:

  • Cryptography Functions: This includes elliptic curve cryptography and hashing functions [00:00:27].
  • State Access: Accessing the blockchain’s state data is a significant performance drain [00:00:35].
  • Signature Recovery: Parallel signature recovery, though already implemented in some clients, remains an expensive part of transaction execution [00:01:05].

Optimization of raw computation alone offers limited gains in modern blockchains [00:00:57]. Instead, the focus shifts to areas like state access.

Database Latency and State Access

Profiling blockchain code reveals that a substantial amount of time is spent on database interactions [00:01:30]. A single read from an SSD can have a latency of 80 to 100 microseconds or more, depending on the SSD model and generation [00:01:38]. This latency is orders of magnitude longer than it takes to execute a simple smart contract [00:01:56].

Executing a single transaction often requires multiple sequential database reads [00:02:02]:

  • Reading the sender’s account to check their balance [00:02:07].
  • Reading the destination account [00:02:11].
  • Reading proxy accounts [00:02:13].
  • Reading storage slots, which is where data like ERC-20 balances or Uniswap data are stored [00:02:17].

When these sequential reads occur without being cached in main memory, the cumulative latency results in significant transaction execution times [00:02:30]. While increasing RAM to cache all data can mitigate this, it leads to very expensive hardware requirements, which is not an optimal solution for broad adoption [00:02:53].
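The arithmetic behind this problem is straightforward. The sketch below (a hypothetical back-of-the-envelope model, using the ~100 µs figure quoted above and the four read types from the list) shows how sequential, uncached reads dominate a transaction's wall-clock time:

```python
# Hypothetical latency model: sequential, uncached SSD reads for one transaction.
SSD_READ_LATENCY_US = 100  # ~80-100 µs per read, per the figures above (assumed: 100)

# Sequential reads a single transfer might require (illustrative, not exhaustive):
reads = ["sender account", "destination account", "proxy account", "storage slot"]

# Reads are dependent (each result determines the next key), so latencies add up
# instead of overlapping.
total_us = len(reads) * SSD_READ_LATENCY_US
print(f"{len(reads)} sequential reads -> ~{total_us} µs per transaction")
```

At ~400 µs per transaction, the database alone caps throughput at roughly 2,500 transactions per second per thread, regardless of how fast the contract logic itself runs.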

Limitations of Standard Databases

General-purpose databases such as Pebble or RocksDB, which are commonly used as the standard storage layer in blockchain clients, present significant performance issues [00:00:09]. These databases fall into two broad families [00:04:28]:

  • B+ tree databases: LMDB and its derivative MDBX [00:04:30].
  • LSM (Log-Structured Merge) trees: RocksDB (a derivative of LevelDB, which was the first) [00:04:37].

The fundamental problem with these options is their “general-purpose” nature [00:04:50]. They are designed for average performance across a wide range of applications, not for highly specialized and performant use cases [00:05:02].

Specific issues include:

  • Embedded Data Structures: Embedding one data structure inside another on-disk structure means each request must traverse both, which makes lookups highly expensive [00:04:06].
  • Inefficient Use of Hardware: Despite the impressive capabilities of modern SSDs (e.g., 500,000 I/O operations per second for some hosts [00:06:58]), general-purpose databases fail to leverage this raw performance [00:03:44].
  • Excessive Requests: Standard blockchain clients using these databases can make an unnecessarily high number of requests (e.g., 20 requests) just to look up basic information [00:07:41].
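The first and third issues compound each other. The toy sketch below (a hypothetical illustration, not any client's actual encoding) shows why embedding a tree-shaped structure inside a key-value store multiplies requests: every node on the path from root to leaf becomes its own KV lookup, and each of those lookups may in turn traverse the store's internal levels on disk:

```python
# Toy illustration: one logical account lookup through a tree embedded in a KV store.
# Each tree node is a separate KV entry, so one lookup becomes one request per node.

db = {}       # stands in for a general-purpose KV store (e.g. an LSM tree)
db_reads = 0  # counts requests; in a real client each is a potential SSD read

def db_get(key):
    global db_reads
    db_reads += 1
    return db[key]

# Build a toy 4-level path: root -> inner node -> inner node -> leaf (account data).
db["root"] = "n1"
db["n1"] = "n2"
db["n2"] = "leaf"
db["leaf"] = {"balance": 42}

# Walk the path to resolve a single account.
node = db_get("root")
while isinstance(node, str):
    node = db_get(node)

print(db_reads)  # -> 4 requests for one logical lookup
```

Real state trees are deeper than four levels, and the KV store adds its own read amplification underneath, which is how a single account lookup balloons into the ~20 requests mentioned above.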

This phenomenon is similar to High-Frequency Trading (HFT), where standard libraries or general data structures are avoided because customizing the data structure to the specific trading model yields significantly better performance from the hardware [00:05:10].

The Solution: Custom Database Optimization

To overcome these challenges and achieve high performance, a custom database approach is necessary [00:05:31]. By understanding exactly how the data needs to be used and stored, developers can implement a database optimized for specific blockchain requirements [00:05:36].

For instance, Monad DB was developed to extract every last bit of performance from the hardware [00:08:16]. It achieves this by drastically reducing the number of requests issued to the hardware: where a typical data structure might make 20 requests to look up an account, Monad DB can perform the same lookup with just one or two [00:07:59]. This degree of specialization is crucial for maximizing throughput and efficiency in blockchain systems [00:08:14].
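To make the contrast concrete, here is a minimal sketch (purely illustrative; it does not reflect Monad DB's actual design) of the idea behind a specialized layout: if records are keyed directly by the identifier the application actually queries, a lookup collapses to a single request instead of one request per intermediate node:

```python
# Hypothetical sketch: a flat, application-specific layout.
# Accounts are keyed directly by address, so there are no intermediate
# nodes to traverse and one logical lookup is one request.

flat_db = {}
flat_reads = 0  # counts requests to the store

def flat_get(key):
    global flat_reads
    flat_reads += 1
    return flat_db[key]

flat_db["0xabc"] = {"balance": 42}  # toy address -> account record

account = flat_get("0xabc")
print(flat_reads)  # -> 1 request for the same logical lookup
```

The trade-off is generality: such a layout only works because the access pattern (lookup by address) is known in advance, which is exactly the knowledge a general-purpose database cannot assume.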

This customized approach is a key part of the evolution towards more performant and scalable blockchain systems.