
How Alchemy Built the Fastest Archival Methods on Solana

Last updated: February 24, 2026 · 7 min read
A chart showing archival data on Solana

Towards the end of last year, we announced our brand new product offering for Solana. One of its key features is the performance of our archival methods, which are up to 20x faster than any other solution on the market.

With our Solana offering, developers can query historical transactions, blocks, and signatures at unprecedented speed via a standard JSON RPC interface. That wasn’t easy to build, and in this post we’ll break down how we did it.

What are archival methods on Solana?

In Solana, archival methods refer to RPC calls that retrieve historical blockchain data, like getTransaction, getBlock, and getSignaturesForAddress.
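For reference, these methods are called over standard JSON-RPC 2.0. Below is a minimal sketch of the request payloads; the signature, slot, and address values are placeholders, not real on-chain data.

```python
import json

def rpc_request(method: str, params: list, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request body for a Solana RPC endpoint."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params,
    })

# Fetch a single historical transaction by signature (placeholder signature).
get_tx = rpc_request("getTransaction", ["5Nf6...placeholder", {"encoding": "json"}])

# Fetch an entire block by slot number.
get_block = rpc_request("getBlock", [250_000_000, {"transactionDetails": "full"}])

# Page through historical signatures for an address (the vote program here).
get_sigs = rpc_request(
    "getSignaturesForAddress",
    ["Vote111111111111111111111111111111111111111", {"limit": 10}],
)
```

Any of these bodies can be POSTed to an RPC endpoint; the archival challenge is entirely on the serving side, not in the request shape.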

These archival endpoints are the foundation of onchain apps. Wallets, indexers, analytics providers, explorers, and much more rely on this historical data to provide richer data to users and build coherent experiences.

However, archival data is notoriously difficult to fetch on Solana. The dataset is massive, deeply interconnected, and computationally heavy to query. For most developers, these endpoints are where latency and inconsistency issues appear first and most acutely.

The performance problem with Solana’s archival methods

Most Solana infrastructure today runs on Google Bigtable coupled with validator-based RPC nodes.

This stack is easy to deploy (which is why many teams choose it), but it’s not built for scale. It’s CPU-intensive, memory-hungry, and struggles with large batch requests. When querying at scale, these methods can throttle throughput and slow your app to a crawl, or, worse, drop data from the response altogether.

This is why even reputable providers often return incomplete or lagging responses when you query historical data. For devs, those missing blocks and slow calls quickly cascade into broken in-app experiences.

We wanted to fix that. Not simply with more hardware (though we added more of that too), but with a better architecture for our infra altogether.

Our optimization struggles with Google Bigtable

Like many Solana infra providers, we started with Google Bigtable, and over time we discovered the performance issues above. Those issues were compounded by two factors.

First and foremost was the large size of Solana blocks. Bigtable performs best when rows are around 1KB, but Solana transaction data and account states are often much larger. Fetching a large binary object requires more I/O from Google’s underlying storage system (Colossus).

Second, there was the question of price. Cloud databases are expensive, and a lot of data on Solana (like voting transactions) is not particularly valuable, so maintaining all of it in the cloud is not economically feasible. Google Bigtable is also priced by usage, which adds cost pressure and leaves less flexibility to offer higher RPS.

Despite those limitations, we still tried to push some optimizations into this setup. For example, we optimized transaction fetching. In Google Bigtable’s default setup, when you request one transaction with getTransaction, it fires a query to a tx table to resolve the block ID from the signature, then fetches the entire block just to return a single transaction. That's a lot of work to return a tiny amount of data from a remote cloud database.

To make that common request more efficient, we introduced an additional table called tx-full where we store not just a reference to the block, but the whole transaction. That made a big difference in response times, but we were still suffering from increased latency due to the remote geographic location of the data center.
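The difference between the two lookup paths can be sketched with in-memory dicts standing in for the tables. The table and field names below are illustrative, not Alchemy's actual schema.

```python
# tx table: signature -> block ID (the original indirection).
tx_index = {"sigA": 1001}

# Block store: block ID -> full block, containing many transactions.
blocks = {1001: {"transactions": {"sigA": {"fee": 5000}, "sigB": {"fee": 7000}}}}

# tx-full table: signature -> the whole transaction, denormalized.
tx_full = {"sigA": {"fee": 5000}}

def get_transaction_indirect(signature: str) -> dict:
    """Original path: resolve the block ID, fetch the whole block, pick one tx."""
    block_id = tx_index[signature]
    block = blocks[block_id]  # heavy read: the entire block comes back
    return block["transactions"][signature]

def get_transaction_direct(signature: str) -> dict:
    """Denormalized path: a single point read returns exactly one transaction."""
    return tx_full[signature]
```

Both paths return the same transaction, but the direct path reads only the bytes it needs, which is the whole point of the tx-full table.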

Even though we tried hard to accommodate the default stack, development was very slow. Even a simple write from a full RPC node/validator was a big pain: every change required full restarts of nodes, a process that could take anywhere from 30 minutes to a few hours. Not having the database geographically closer, or the ability to control its parameters, also made everything harder and more expensive. After some effort, we decided it would be better to rewrite the entire stack, from data ingestion to the RPC server itself.

Our solution? A rewritten stack

Our main goal when taking on this work was to speed up development. This meant no heavy RPC nodes that both write and serve data. Instead, we separated writing and serving into small, independent services and switched to a self-hosted, open source database.

That new system offers:

  • Hardware and software co-optimization: Every layer of the stack, from disk layout to memory access, is tuned to maximize throughput and minimize compute overhead.

  • Multi-region data distribution: Archive data is globally distributed, ensuring low-latency access no matter where requests originate.

  • Triple-verified ingestion: Each record is written twice, validated programmatically for accuracy, and continuously scanned for completeness.

  • Self-healing pipelines: If discrepancies are detected, we automatically re-ingest missing entries, cross-checking up to 30–50 related addresses per block to guarantee data integrity.

Let’s go through each part of the solution, which we also made open source (this code lives in DexterLab’s repo; DexterLab was acquired by Alchemy last year).

ArchivalRPC

We rebuilt our archival service from scratch into a much more performant service that can be started or restarted in a few seconds: a dramatic improvement over the 30-minute to several-hour spin-up times we dealt with when working with a full RPC node.

We still use the same official Solana libraries to serialize and deserialize data, so the service will always stay compatible with ongoing changes. This means that even with our new custom stack, we don’t alter the data at all, ensuring that what we serve is never malformed during processing.

And because it’s a lightweight service that depends only on a database, it’s very easy to scale on demand (in only a few seconds) to accommodate massive amounts of traffic. This is particularly useful for spiky applications; with a single DB instance, the service can reach 100-200k RPS depending on the database setup.

ArchivalRPC Ingestor

This part of the stack is responsible for pulling Solana data from multiple sources and writing it into our database layer. It handles both bulk imports (such as raw gzip block archives used when bootstrapping new instances) and real-time streams from systems like Kafka.

We also built granular controls directly into the ingestor, which allow us to choose exactly which data types an instance should store (e.g. blocks only, signatures only, transactions only, or any combination). This selective-ingestion model reduces single points of failure, improves reliability, and gives us meaningful levers to optimize storage costs depending on the use case.
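A selective-ingestion control like this can be sketched as a small config object. The field and type names below are hypothetical, chosen only to illustrate the idea of per-instance data-type selection.

```python
from dataclasses import dataclass

@dataclass
class IngestorConfig:
    """Hypothetical per-instance config: choose which record types to store."""
    store_blocks: bool = False
    store_signatures: bool = False
    store_transactions: bool = False

    def enabled_types(self) -> list:
        """List the data types this instance will ingest, in a fixed order."""
        return [name for name, on in [
            ("blocks", self.store_blocks),
            ("signatures", self.store_signatures),
            ("transactions", self.store_transactions),
        ] if on]

# A signatures-only instance keeps its storage footprint and blast radius small.
sigs_only = IngestorConfig(store_signatures=True)
```

Running dedicated instances per data type is what reduces single points of failure: an outage in the transactions pipeline leaves signature ingestion untouched.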

HBase

Once we improved read and write performance, we turned our attention to the next challenge: cloud database latency. ArchivalRPC was designed from day one to support multiple storage backends, which made it straightforward to explore alternatives.

We ultimately chose HBase, an open-source, self-hosted replacement for BigTable, because it let us co-locate the database alongside our RPC servers and drive storage latency effectively to zero. This implementation also enabled us to use our existing data models and implementation principles, giving us the performance benefits without introducing architectural risks.

But even with the strengths of HBase, we still ran into some issues. For example, to handle the large size of Solana blocks, we originally tried to use BucketCache in HBase, which can be configured to use off-heap memory or fast SSDs as a secondary cache layer and keep more data in a “warm” state.

However, we ultimately found that BucketCache was unreliable: because of the randomness of the data and how quickly it changed, it slowed down our system. Instead, we turned to a different solution for handling large data: we changed how blocks are stored. Rather than storing full blobs in the database (which isn’t designed to handle such large structures), we store only the metadata, which points to CAR files that can reside on any storage and scale very well.
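The metadata-pointer pattern can be sketched as below. The record fields and the storage URI are illustrative assumptions; the real schema and storage layout are not described in this post.

```python
db = {}  # stands in for the database's metadata table

def write_block(slot: int, car_uri: str, offset: int, length: int) -> None:
    """Record where the block's bytes live instead of storing the bytes."""
    db[slot] = {"car_uri": car_uri, "offset": offset, "length": length}

def locate_block(slot: int) -> dict:
    """A reader resolves the pointer, then range-reads the CAR file itself."""
    return db[slot]

# The database row stays tiny; the multi-megabyte payload lives elsewhere.
write_block(250_000_000, "s3://archive/epoch-578.car", offset=0, length=4_194_304)
meta = locate_block(250_000_000)
```

The database only ever handles small, uniform rows (its sweet spot), while the blob storage backend handles large sequential reads (its sweet spot).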

Service optimizations and multi-region distribution

Building a solution that can scale to millions, and even billions, of requests required optimizations across the entire stack: swapping one database architecture for another wasn’t enough on its own to get the performance we were looking for. We had to try many different deployments on different hardware and experiment with different architectures. Over time, we improved our ingestion and serving services and got the desired results.

Optimizing our hardware and software to handle high volumes of requests was one challenge we had to solve. Another was fighting latency. For heavy archival calls, we found getting latency under control required expanding our hardware into multiple regions to ensure consistent fast responses around the globe.

This multi-region setup also improves our availability and helps us maintain higher reliability standards. Outages are inevitable, and having fallback options span geographic regions ensures that even if one region goes down, we can continue to serve our customers. This same multi-region configuration also helps us scale and distribute traffic: if one region is reaching capacity, we can route traffic to another region as we scale the deployment.
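The routing behavior described above can be sketched as a toy capacity-aware selector. Region names, fields, and the utilization heuristic are all illustrative; real routing also involves health checks, latency measurements, and gradual traffic shifting.

```python
def pick_region(regions: list) -> str:
    """Route to the least-utilized healthy region with spare capacity."""
    candidates = [r for r in regions if r["healthy"] and r["load"] < r["capacity"]]
    if not candidates:
        raise RuntimeError("no healthy region with spare capacity")
    return min(candidates, key=lambda r: r["load"] / r["capacity"])["name"]

regions = [
    {"name": "us-east",  "healthy": True,  "load": 95_000, "capacity": 100_000},
    {"name": "eu-west",  "healthy": True,  "load": 20_000, "capacity": 100_000},
    {"name": "ap-south", "healthy": False, "load": 0,      "capacity": 100_000},
]
# us-east is near capacity and ap-south is down, so traffic shifts to eu-west.
```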

Triple-verified ingestion

A lot of the work for this product was improving performance, but that’s only half the battle. One ongoing issue with Solana archival methods is that data often gets dropped altogether. We had to improve the reliability of the system.

Many Bigtable-based providers silently miss blocks, signatures, or transactions as their ingestion pipelines scale. Those misses happen due to how default data ingestion works from a validator: every restart can cause gaps in data writes.

And if you are locked into maintaining multiple validators writing the same data, that still does not guarantee data consistency because there are no secondary checks that the data is being written correctly. And as a builder using that infra, you have no idea that data is missing until something breaks downstream.

Our infrastructure prevents that problem by design. After writing data, we run secondary pipelines to read every record and continuously validate that data and automatically repair it if inconsistencies appear. We also do full checks of our databases after initial bootstrap, and we continue to do checks on live writes in close to real time.
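In essence, the verification pass re-reads every stored record and diffs it against the source payload, queueing anything missing or mismatched for repair. This is a simplified sketch; the real pipeline operates on streams, not in-memory dicts.

```python
def verify_and_repair(source: dict, stored: dict) -> list:
    """Return the slots that need re-ingestion (missing or mismatched records)."""
    to_repair = []
    for slot, payload in source.items():
        if stored.get(slot) != payload:  # covers both absent and corrupt rows
            to_repair.append(slot)
    return sorted(to_repair)

source = {100: "blockA", 101: "blockB", 102: "blockC"}
stored = {100: "blockA", 102: "blockC-corrupt"}  # 101 missing, 102 mismatched
```

The key property is that misses are caught by the pipeline itself, rather than surfacing downstream as a broken app.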

We didn’t just build faster data. We built trusted data.

Self-healing pipelines

For some complex data, like signatures, fallback methods can’t be trusted. This means the above triple-verification process is sometimes still not enough.

To combat this, we use programmatic validation to cross-check transactions across up to 30-50 addresses per block. If our system detects a discrepancy, our backup and recovery process automatically repairs the missing entries.

That process involves fetching a block that serves as the source of truth and extracting all address-signature relationships from it. To optimize the cost of scanning HBase here, we accumulate these mappings per address over a large range of consecutive slots and perform a scan query instead of individual key-based queries. The results are then compared, which enables us to identify gaps in the SIGS cluster.
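The comparison step reduces to a set difference between the address-signature pairs the source-of-truth block implies and the pairs the range scan actually returned. The block structure below is a simplified stand-in for the real transaction format.

```python
def expected_mappings(block: dict) -> set:
    """All (address, signature) pairs the block says should exist."""
    pairs = set()
    for tx in block["transactions"]:
        for addr in tx["accounts"]:
            pairs.add((addr, tx["signature"]))
    return pairs

def find_gaps(block: dict, scanned: set) -> set:
    """Pairs present in the source-of-truth block but missing from the scan."""
    return expected_mappings(block) - scanned

block = {"transactions": [
    {"signature": "sig1", "accounts": ["addrA", "addrB"]},
    {"signature": "sig2", "accounts": ["addrA"]},
]}
scanned = {("addrA", "sig1"), ("addrA", "sig2")}  # addrB's entry was dropped
```

Accumulating the mappings over a large range of consecutive slots is what makes a single scan query cheaper than firing one key lookup per pair.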

Once we identify those gaps, we repost that block to the indexing pipeline (to a suitable Kafka topic for this data type) and reindex it. We also added a consistency checker process that repeats this operation if the gap isn’t filled, and eventually alerts our team if the failure state continues.

This redundancy and self-healing ensures that you can rely on our methods as the most complete and consistent source of historical data on Solana.

Translating the work to throughput

All of that work has translated to meaningful capacity:

  • 100,000 RPS per region for getTransaction

  • 50,000 RPS per region for getSignaturesForAddress

  • 2,000 RPS per region for getBlock

We are continuing to invest in our Solana offering and will be expanding to more regions soon for even better reliability and latency. This is just the beginning.

Try It Yourself

If you want to try our archival methods on Solana, you can get started for free by creating a dashboard account to generate your API key. Then you can explore our Solana endpoints in our documentation.
