From: thepipeline_xyz

Optimizing the performance of an EVM client requires a deep understanding of where processing time is actually spent within a blockchain system [01:28:00]. While common assumptions might point to complex computation, the most expensive parts of blockchain operations are typically cryptography functions (such as elliptic curve cryptography and hashing) and state access [00:22:00].
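To get a feel for the cost of hashing, a quick microbenchmark helps. The sketch below times hashing a 32-byte word using Python's built-in `sha3_256` as a stand-in for the Keccak-256 used throughout EVM state handling (NIST SHA-3 differs from Ethereum's Keccak-256 in padding, but the cost profile is similar); the iteration count is arbitrary.

```python
# Microbenchmark: hash a 32-byte word repeatedly, as a rough stand-in
# for the Keccak-256 hashing done throughout EVM state handling.
import hashlib
import timeit

word = b"\x00" * 32
n = 100_000
secs = timeit.timeit(lambda: hashlib.sha3_256(word).digest(), number=n)
print(f"~{secs / n * 1e9:.0f} ns per 32-byte hash")
```

Even at a few hundred nanoseconds per hash, millions of hashes per block add up, which is why cryptography sits alongside state access at the top of the cost profile.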

Limitations of General Computation Optimization

Unlike desktop or phone applications, modern blockchains generally do not involve extensive computation within their smart contract logic [00:41:00]. As such, simply optimizing computation, including using parallel EVM or parallel signature recovery, yields limited gains [00:57:00]. Even throwing multiple cores at signature recovery, which is an embarrassingly parallel problem, does not significantly improve overall blockchain performance [01:20:00].
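Amdahl's law makes the limitation concrete: if only a fraction of block processing parallelizes, the serial remainder caps the speedup no matter how many cores you add. The 20% figure below is purely an illustrative assumption, not a number from the talk.

```python
# Back-of-envelope Amdahl's law illustration (hypothetical fractions):
# even if signature recovery parallelizes perfectly, total speedup is
# capped by the serial remainder (state access, etc.).

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Overall speedup when only `parallel_fraction` of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# Assume (for illustration only) 20% of block time is signature recovery.
for cores in (1, 4, 16, 64):
    print(f"{cores:>3} cores -> {amdahl_speedup(0.20, cores):.2f}x")
```

Under that assumption, even infinite cores yield at most a 1.25x speedup, which is why attention shifts to the serial part: state access.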

The Bottleneck: State Database Access

The primary bottleneck in EVM performance is database access, particularly reading from storage [01:35:00]. A single read from an SSD can introduce a latency of 80 to 100 microseconds or more, depending on the SSD model [01:38:00]. This is orders of magnitude longer than the time it takes to execute a simple smart contract [01:56:00].

Executing a single transaction often requires multiple sequential reads from the database [02:02:00]. For instance, processing a transaction requires reading the sender’s account balance, the destination account, any proxy accounts, and specific storage slots if it’s an ERC-20 token or a Uniswap transaction [02:07:00]. If these reads are not cached in main memory and occur sequentially, the cumulative latency results in significant execution time for a single transaction [02:30:00]. While one solution is to equip every node with very large, expensive RAM to prevent disk reads, this is not an economically viable or scalable approach [02:53:00].
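The cumulative cost is easy to quantify with the figures above. Taking ~100 µs per uncached SSD read from the text and assuming, for illustration, eight sequential reads for an ERC-20 transfer, a single-threaded, fully sequential executor hits a hard throughput ceiling:

```python
# Cumulative latency of sequential uncached reads for one transaction,
# using the ~100 µs per-read figure from the text. The read count of 8
# (sender, recipient, proxy, storage slots) is an illustrative assumption.

SSD_READ_US = 100.0                      # per-read latency from the text
reads_per_tx = 8                         # assumed reads for an ERC-20 transfer

tx_latency_us = reads_per_tx * SSD_READ_US
tps_ceiling = 1_000_000 / tx_latency_us  # single-threaded, fully sequential

print(f"per-tx latency: {tx_latency_us:.0f} us")
print(f"sequential throughput ceiling: {tps_ceiling:.0f} tx/s")
```

At 800 µs per transaction, the ceiling is about 1,250 tx/s regardless of how fast the EVM interpreter itself is.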

Challenges with Standard Databases

Standard databases like Pebble, RocksDB, LMDB (and its derivative MDBX), and LevelDB are general-purpose databases [04:09:00]. These databases are designed for average performance across a wide range of applications [05:05:00]. However, when used in blockchain clients, their performance can be “terrible” compared to the raw capabilities of modern SSDs [03:44:00].

One issue is that they often embed one data structure within another on disk (for example, storing state-trie nodes as values inside an LSM-tree key-value store), so every request pays for two traversals [04:06:00]. Furthermore, many existing blockchain clients using these databases make an excessive number of requests for basic lookups [07:39:00].
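A toy model shows how the request count balloons. Here trie nodes are stored as values in a generic key-value store, so resolving one account walks the trie and issues a separate KV request per level; in a real LSM-tree or B-tree database, each of those requests is itself a multi-level traversal. The depth of 8 and the key names are illustrative, not taken from any real client.

```python
# Toy illustration of the "one data structure embedded in another"
# problem: trie nodes stored as values in a generic key-value store.
# Every trie level costs one KV request, and each KV request is itself
# a traversal inside the database.

class CountingKV:
    def __init__(self):
        self.store = {}
        self.requests = 0

    def get(self, key):
        self.requests += 1
        return self.store[key]

kv = CountingKV()
# Build a chain of 8 trie nodes, each holding a reference to its child.
depth = 8
for i in range(depth):
    kv.store[f"node{i}"] = f"node{i+1}" if i + 1 < depth else "account-data"

node = "node0"
while node != "account-data":
    node = kv.get(node)          # one KV request per trie level

print(f"{kv.requests} KV requests for a single account lookup")
```

One logical lookup turns into eight physical requests before the database's own internal traversals are even counted.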

The Case for Custom Databases

To achieve high blockchain performance, especially in the context of an EVM, it is crucial to use custom databases tailored to the specific needs of blockchain data storage and access [05:42:00]. This approach is analogous to practices in high-frequency trading (HFT), where standard libraries and data structures are avoided in favor of customized solutions that extract maximum performance from hardware [05:10:00].

Modern SSDs are incredibly powerful, capable of 500,000 I/O operations per second [06:54:00]. A custom database like Monad DB is designed to fully leverage this raw performance by knowing exactly how data is stored and accessed [05:31:00]. This allows for unique database optimizations that drastically reduce the number of requests to the hardware [07:47:00]. For example, Monad DB might make only one or two requests to look up an account, compared with the roughly twenty requests other clients’ data structures can require when nodes are not in cache [07:59:00]. This “super optimization” extracts every last bit of performance from the hardware [08:14:00].
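The one-or-two-request idea can be sketched with a flat, hash-addressed table: when the database controls the on-disk layout, an account's location can be computed directly from its address rather than discovered by traversal. This is entirely illustrative (collisions are ignored and nothing here reflects Monad DB's actual format); it only demonstrates why a layout-aware design needs so few requests.

```python
# Sketch of the custom-layout idea: compute an account's slot directly
# from its address, so a lookup costs one request instead of ~20.
# Collision handling is omitted; this is not Monad DB's real format.

class FlatStore:
    def __init__(self, slots: int):
        self.slots = [None] * slots
        self.requests = 0

    def _index(self, address: bytes) -> int:
        return int.from_bytes(address, "big") % len(self.slots)

    def put(self, address: bytes, value: str):
        self.slots[self._index(address)] = (address, value)

    def get(self, address: bytes):
        self.requests += 1                 # one "I/O request" to the hardware
        slot = self.slots[self._index(address)]
        return slot[1] if slot and slot[0] == address else None

db = FlatStore(1 << 16)
db.put(b"\x01" * 20, "balance=42")
print(db.get(b"\x01" * 20), "in", db.requests, "request(s)")
```

With ~100 µs per request, cutting twenty requests down to one or two is the difference between milliseconds and microseconds per account lookup.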