
Zero downtime, zero gaps: how we built fault-tolerant Solana gRPC streaming

Author: Alchemy

Last updated: May 6, 2026 · 4 min read

For teams building trading infrastructure, MEV searchers, liquidation bots, and real-time indexers, the product is only as fast as the data feeding it. A stream that spikes or drops means a missed trade or an incomplete index.

That's why we built our Solana gRPC streaming product on a multi-node redundancy layer, so spikes and drop-offs get absorbed before they hit your pipeline, and you don't have to engineer failover logic on top of a service that's supposed to be managed. This is how we architected it.

Why gRPC matters on Solana

Solana's data output is massive: over 4 petabytes annually at peak speeds. Traditional polling over JSON-RPC introduces unnecessary overhead: repeated requests, wasted bandwidth, and inherent latency from the request-response cycle. For workloads that need every account update or transaction the moment it's confirmed, polling doesn't cut it.

gRPC streaming (Protobuf over HTTP/2) inverts the model. The server pushes data to the client as it's produced. Builders subscribe to exactly the data they need — specific accounts, programs, transaction patterns — and receive a continuous, filtered stream with zero polling latency. For Solana, where block times are sub-second and data volume is enormous, this is the ideal solution for apps that prioritize delivering the most current data to users.

The Yellowstone gRPC interface has become the de facto standard for Solana streaming. Builders have existing client libraries in Rust, TypeScript, Go, and anything that compiles a .proto file. We leveraged that standard, so migration from any Yellowstone-compatible provider is a URL change, not a rewrite.
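Here's a minimal sketch of what that connection looks like, assuming the open-source @triton-one/yellowstone-grpc TypeScript client; the endpoint URL and token are placeholders, and field names can vary slightly between client versions.

```typescript
// Minimal sketch assuming the open-source @triton-one/yellowstone-grpc client.
// The endpoint URL and token are placeholders — migrating providers means
// swapping the URL, nothing else.
import Client, { CommitmentLevel } from "@triton-one/yellowstone-grpc";

async function main() {
  const client = new Client(
    "https://your-solana-grpc-endpoint.example.com", // provider URL (placeholder)
    "your-api-token",                                // auth token, if required (placeholder)
    undefined                                        // optional gRPC channel options
  );

  // Open the bidirectional stream and log everything the server pushes.
  const stream = await client.subscribe();
  stream.on("data", (update) => console.log(update));

  // Subscribe to all slot updates as a simple connectivity check.
  stream.write({
    slots: { allSlots: {} },
    accounts: {},
    transactions: {},
    transactionsStatus: {},
    blocks: {},
    blocksMeta: {},
    entry: {},
    accountsDataSlice: [],
    commitment: CommitmentLevel.CONFIRMED,
  });
}

main().catch(console.error);
```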

The architecture: from shreds to streams

Our streaming pipeline has four layers, each designed with built-in redundancy for reliable performance at scale.

Shred ingestion

Shreds are the fundamental unit of data propagation on Solana — MTU-sized frames (~1,280 bytes) containing fragments of block entries. The network broadcasts shreds via Turbine, a multi-hop protocol where validators relay data through layered neighborhoods. Turbine is fast, but it's probabilistic. Packet loss accumulates across hops, and a node deep in the tree can wait hundreds of milliseconds for gap repair before assembling a complete block.

We ingest shreds from multiple independent sources simultaneously — including services that collect shreds directly from high-stake validators, bypassing Turbine's multi-hop latency. Each source covers different geographic paths and validator neighborhoods. Gaps in one stream are filled by another, which means faster block assembly with fewer repair cycles.

This matters because a node needs a minimum number of shreds to assemble each slot. If it doesn't receive enough within roughly 250ms, it falls back to the repair protocol — requesting missing shreds from peers, which adds significant latency. Getting enough shreds before repair kicks in is critical. That's why we ingest from multiple independent sources: more coverage means the node hits that threshold faster, assembles entries sooner, and fires Geyser notifications earlier. Every millisecond saved here cascades downstream.
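To make the idea concrete, here's a purely illustrative sketch of multi-source shred assembly. The Shred shape and the completeness check are simplified stand-ins, not the actual Agave or Richat implementation:

```typescript
// Purely illustrative: why multi-source ingestion hits the assembly threshold
// sooner. The Shred shape and completeness check are simplified stand-ins,
// not the actual Agave or Richat implementation.
type Shred = { slot: number; index: number; lastInSlot: boolean; payload: Uint8Array };

class SlotAssembler {
  // slot -> shred index -> shred; duplicates from different sources collapse here.
  private shreds = new Map<number, Map<number, Shred>>();

  // Accept a shred from any source; only the first arrival per index is kept.
  ingest(shred: Shred): void {
    const perSlot = this.shreds.get(shred.slot) ?? new Map<number, Shred>();
    if (!perSlot.has(shred.index)) perSlot.set(shred.index, shred);
    this.shreds.set(shred.slot, perSlot);
  }

  // The slot can be assembled once every index up to the final shred is present.
  // If this never becomes true within the ~250ms window, the node falls back to
  // the repair protocol and requests the missing indices from peers.
  isAssemblable(slot: number): boolean {
    const perSlot = this.shreds.get(slot);
    if (!perSlot) return false;
    const last = [...perSlot.values()].find((s) => s.lastInSlot);
    if (!last) return false;
    for (let i = 0; i <= last.index; i++) {
      if (!perSlot.has(i)) return false;
    }
    return true;
  }
}
```

More independent sources means more chances for each index to arrive before the timeout, which is exactly the gap-filling effect described above.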

The Richat engine

At the core of our streaming service is Richat, an open-source engine that runs as a Geyser plugin on the Solana node. It captures account updates, transaction notifications, slot status, block metadata, and entry updates as they're produced by the Agave client, then streams them to a Richat server that acts as the aggregation and fan-out layer.

The key design decision: each Richat server connects to multiple upstream Solana nodes simultaneously. It deduplicates updates across sources and delivers the fastest healthy result downstream. When one upstream node falls behind on a slot, the aggregation layer serves data from the next-fastest node.

This is what gives us latency spike attenuation. Single-node deployments have a long tail on p99 and p999 latency because every bad moment on the upstream node passes through to every consumer. Multi-node aggregation flattens that tail — the builder's stream reflects the best-performing node at any given moment, not the only one.
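Conceptually, the aggregation layer behaves like the sketch below. It's a hypothetical simplification of the dedup-and-fastest-wins idea, not the Richat server's actual internals:

```typescript
// Purely illustrative: dedup-and-fastest-wins across multiple upstream nodes.
// Update, its key, and the forward callback are simplified stand-ins.
type Update = {
  slot: number;
  kind: "account" | "transaction" | "slot" | "blockMeta" | "entry";
  key: string;          // e.g. account pubkey + write version, or tx signature
  payload: Uint8Array;
};

class FastestWinsAggregator {
  private forwarded = new Set<string>();

  constructor(private forward: (u: Update) => void) {}

  // Every upstream node pushes its copy of every update. Only the first arrival
  // is forwarded, so downstream consumers always track the fastest healthy node;
  // an upstream that falls behind simply stops winning the race for a while.
  onUpstreamUpdate(update: Update): void {
    const dedupKey = `${update.slot}:${update.kind}:${update.key}`;
    if (this.forwarded.has(dedupKey)) return;
    this.forwarded.add(dedupKey);
    this.forward(update);
  }
}
```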

The builder's stream

Builders connect with standard Yellowstone client libraries. Once connected, they can subscribe to account updates (with program, owner, and data-slice filters), transaction updates (with account-include/exclude and signature filters), and slot, block, block-meta, and entry updates. Commitment levels (processed, confirmed, finalized), Zstd compression, and server-side keepalive are all supported natively.

Multiple concurrent subscriptions can be composed with AND/OR semantics on a single connection. Bidirectional streaming lets builders subscribe, modify, and cancel without reconnecting.
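As a sketch of what composing subscriptions looks like, again assuming the @triton-one/yellowstone-grpc client — the filter names and addresses are placeholders, and field names may differ slightly across client versions:

```typescript
// Sketch of composed subscriptions on one connection, assuming the
// @triton-one/yellowstone-grpc client; filter names and addresses are placeholders.
import Client, { CommitmentLevel } from "@triton-one/yellowstone-grpc";

const emptyRequest = {
  accounts: {},
  slots: {},
  transactions: {},
  transactionsStatus: {},
  blocks: {},
  blocksMeta: {},
  entry: {},
  accountsDataSlice: [],
};

async function subscribeComposed(client: Client) {
  const stream = await client.subscribe();
  stream.on("data", (update) => console.log(update));

  // Two named subscriptions in one request: account updates owned by a program,
  // plus non-vote transactions that touch a specific address.
  stream.write({
    ...emptyRequest,
    accounts: {
      programAccounts: { account: [], owner: ["YourProgramIdHere"], filters: [] },
    },
    transactions: {
      walletTxs: {
        vote: false,
        failed: false,
        accountInclude: ["YourWalletAddressHere"],
        accountExclude: [],
        accountRequired: [],
      },
    },
    commitment: CommitmentLevel.CONFIRMED,
  });

  // Modify the subscription later by writing a new request on the same stream —
  // no reconnect required on the bidirectional channel.
  stream.write({ ...emptyRequest, slots: { allSlots: {} } });
}
```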

Replay on reconnect

Brief disconnections and client restarts are inevitable. The question is whether they cost the builder data.

We store a rolling window of recent updates in Richat's ring buffer. When a client reconnects, it can subscribe with a from_slot parameter and backfill everything it missed before picking up the live stream. No separate backfill pipeline. No gap-detection logic to maintain. No reconciliation service running alongside the consumer.
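In practice, reconnection looks roughly like this sketch, assuming the same Yellowstone TypeScript client, an endpoint that honors the from_slot field, and a lastProcessedSlot value tracked by your consumer:

```typescript
// Sketch of reconnect-with-replay, assuming the @triton-one/yellowstone-grpc
// client and an endpoint that honors the Yellowstone fromSlot field.
import Client from "@triton-one/yellowstone-grpc";

async function resubscribeWithReplay(client: Client, lastProcessedSlot: bigint) {
  const stream = await client.subscribe();

  stream.on("data", (update) => {
    // Backfilled updates from the ring buffer arrive first, then the stream
    // continues live — the consumer sees a single gap-free feed.
    console.log(update);
  });

  stream.write({
    accounts: {
      myProgram: { account: [], owner: ["YourProgramIdHere"], filters: [] },
    },
    slots: {},
    transactions: {},
    transactionsStatus: {},
    blocks: {},
    blocksMeta: {},
    entry: {},
    accountsDataSlice: [],
    // Resume from the slot after the last one fully processed; the exact
    // numeric type (bigint vs string) depends on the client version.
    fromSlot: lastProcessedSlot + 1n,
  });
}
```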

This is a simpler operational model for builders who need gap-free streams over long time horizons — indexers, analytics platforms, and anyone running a stateful consumer that can't afford to miss an update.

What's next

The current architecture is the foundation, not the ceiling. We're actively working on:

Extended replay window. Growing the reconnect backfill window significantly for builders who need longer recovery coverage.

Hardware and network optimizations. Ongoing infrastructure improvements to reduce latency across the critical path.

Additional regions. Expanding coverage to bring streams closer to more builders and more validators.

Dedicated deployments. For the small number of teams where co-location and isolation are requirements, not preferences.

Get started

Alchemy Solana gRPC is available today on all paid plans starting at $75/TB with no plan gate and no monthly minimum. If you're already running a Yellowstone client, migration is a URL change.
