From: thepipeline_xyz

Optimizing database performance is crucial for high-performance blockchains, especially under the demands of the Ethereum Virtual Machine (EVM) [00:00:15]. While computation in smart contracts is often relatively cheap [00:00:44], and tasks like signature recovery are trivially parallelizable [00:01:24], the most expensive parts of blockchain operation are cryptographic functions (elliptic curve cryptography and hashing) and, above all, state access from the database [00:00:35].

The Database Bottleneck

Profiling code reveals that a significant amount of time is spent on database operations [00:01:30]. A single read from an SSD can have a latency of 80 to 100 microseconds or more [00:01:40]. This is orders of magnitude longer than it takes to execute a simple smart contract [00:01:56].
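A back-of-the-envelope comparison makes the gap concrete. The figures below are illustrative assumptions (roughly 100 microseconds per random SSD read, about 1 nanosecond per simple interpreted VM operation), not measurements:

```python
# Back-of-the-envelope: how much compute fits in one uncached SSD read?
# Assumed figures (illustrative, not measured): ~100 us per random SSD read,
# ~1 ns per simple interpreted VM operation.
SSD_READ_US = 100
NS_PER_OP = 1.0

ops_per_read = SSD_READ_US * 1_000 / NS_PER_OP
print(f"{ops_per_read:,.0f} simple ops fit in one SSD read")
```

Under these assumptions, a single uncached read costs as much time as on the order of a hundred thousand simple operations, which is why state access dominates the profile.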

A typical transaction involves multiple sequential reads:

  • Reading the sender’s account to check their balance [00:02:07].
  • Reading the destination account [00:02:11].
  • Reading proxy accounts [00:02:13].
  • Accessing storage slots, such as balances for ERC20 tokens or data for Uniswap [00:02:17].

If these reads are not cached in main memory, their latencies sum up, leading to considerable time for a single transaction [00:02:30].
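The additive cost of those dependent reads can be sketched directly. The read list mirrors the bullets above; the 100-microsecond per-read latency is an assumed figure, not a benchmark:

```python
# Sketch: total latency when a transaction's reads are dependent (each read's
# target is only known after the previous one) and none are cached.
READ_LATENCY_US = 100  # assumed per-read SSD latency

reads = [
    "sender account",       # balance/nonce check
    "destination account",
    "proxy account",        # e.g. a proxy contract's implementation
    "storage slot",         # e.g. an ERC20 balance or Uniswap pool data
]

total_us = len(reads) * READ_LATENCY_US  # latencies sum, they cannot overlap
print(f"{len(reads)} dependent reads -> at least {total_us} us per transaction")
```

Because each read depends on the result of the one before it, the latencies cannot be overlapped; they simply add up.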

The Potential of Modern SSDs

Modern SSDs are “amazing hardware” [00:03:32], offering incredible performance, such as 500,000 I/O operations per second (IOPS) [00:06:58]. They are a significant leap from older spinning disk hard drives, whose physical seek mechanics made anything but sequential reads slow [00:06:01]. There’s a lot of untapped “juice packed into SSDs” [00:06:28].
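Reaching that advertised throughput requires keeping the drive busy. A quick Little's-law estimate, using the 500,000 IOPS figure above and an assumed 100-microsecond per-read latency, shows how many requests must be in flight at once:

```python
# Little's law sketch: concurrency needed to saturate an SSD.
# 500,000 IOPS is the figure quoted above; the 100 us latency is assumed.
IOPS = 500_000
LATENCY_S = 100e-6  # 100 microseconds per read

queue_depth = IOPS * LATENCY_S  # requests that must be in flight on average
print(f"~{queue_depth:.0f} concurrent requests to sustain {IOPS:,} IOPS")
```

A client that issues its reads one at a time, as in the sequential transaction example above, leaves almost all of that capacity unused.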

Why Standard Databases Fall Short

Despite the raw performance of SSDs, standard general-purpose databases such as Pebble or RocksDB often deliver “terrible performance” when used in blockchain clients [00:03:51].

The issues with general-purpose databases include:

  • Layered data structures: They often embed one data structure inside another, leading to expensive double-traversal for every request [00:04:06].
  • General-purpose design: B+ tree stores (LMDB, MDBX) and LSM-tree stores (RocksDB, LevelDB) are designed to be “performant on average” across general workloads [00:04:47]. They are not tailored to any specific access pattern.
  • Excessive requests: Some blockchain clients using these databases make “so many requests just to look up something basic” [00:07:39].
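The multiplication behind the “layered data structures” point can be sketched with toy numbers: imagine a Merkle-trie node walk where every trie hop itself costs several LSM-level probes. The depth and probe counts below are illustrative assumptions, not measurements of any particular client:

```python
# Sketch of the double-traversal cost: one data structure (a trie) embedded
# inside another (an LSM tree). Figures are illustrative assumptions.
trie_depth = 7   # assumed trie hops needed to reach one account
lsm_probes = 3   # assumed LSM levels probed per hop with a cold cache

total_requests = trie_depth * lsm_probes  # costs multiply, not add
print(f"{total_requests} point reads for a single account lookup")
```

The costs of the two layers multiply rather than add, which is how a single “basic” lookup turns into dozens of point reads.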

Relying solely on hardware improvements while neglecting software optimization is a mistake [00:07:30]: an algorithm with better computational complexity but a poor implementation can still lose in practice [00:07:12].

The Solution: Custom Database Optimization

To truly leverage the capabilities of SSDs and achieve high-performance blockchains, custom databases are essential [00:00:17]. This approach is common in fields like high-frequency trading (HFT), where standard libraries are avoided in favor of customized data structures to extract maximum performance [00:05:10].

By knowing exactly how data will be used and stored, a custom database can be designed to perform each operation in the most efficient way [00:05:32]. For example, Monad DB might make only one or two requests to look up an account, compared with roughly 20 requests for general-purpose data structures when the data is not in cache [00:07:59]. This “super optimization” is key to extracting every last bit of performance from the hardware [00:08:14].
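The "know your layout, pay one read" idea can be illustrated with a toy fixed-layout account table: because the record size and key-to-offset mapping are decided up front, a lookup is a single positioned read. This is a minimal sketch of the general principle, not Monad DB's actual design, and it omits collision handling entirely:

```python
# Toy fixed-layout account store: the address alone determines the on-disk
# offset, so every lookup is exactly one positioned read. Illustrative only;
# no collision handling, and not any real client's on-disk format.
import os
import struct
import tempfile

RECORD = struct.Struct("<32sQ")   # 32-byte address, 8-byte balance
SLOTS = 1024                      # toy capacity

def slot(addr: bytes) -> int:
    # Fixed layout: derive the slot index directly from the address.
    return int.from_bytes(addr[:8], "little") % SLOTS

def put(fd: int, addr: bytes, balance: int) -> None:
    os.pwrite(fd, RECORD.pack(addr, balance), slot(addr) * RECORD.size)

def get(fd: int, addr: bytes):
    # Exactly one positioned read per lookup -- no tree traversal.
    raw = os.pread(fd, RECORD.size, slot(addr) * RECORD.size)
    stored, balance = RECORD.unpack(raw)
    return balance if stored == addr else None

fd, path = tempfile.mkstemp()
os.ftruncate(fd, SLOTS * RECORD.size)
addr = b"\x01" * 32
put(fd, addr, 1000)
found = get(fd, addr)
missing = get(fd, b"\x02" * 32)
print(found, missing)  # 1000 None
os.close(fd)
os.remove(path)
```

A real state database must also handle collisions, variable-sized values, and Merkle proofs, but the core trade is the same: the more the layout encodes the access pattern, the fewer reads each request needs.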