The Economics of a Sub-10ms Solana RPC

StreamSync Team · March 22, 2026

architecture, economics

Most “guaranteed” RPC SLAs are marketing copy. The provider promises 99.9% uptime; if they miss it, you submit a ticket, prove the miss, wait two months, and possibly receive a credit equal to a fraction of one hour’s spend. The number on the page is never load-bearing — it doesn’t change what the operator does, and it doesn’t change what you pay if they fail. It is closer to a brochure than a contract.

We built StreamSync on the opposite premise. The 10ms latency target is enforced by the payment contract itself. If no operator returns a verified response within the window, you are not billed. If an operator returns the wrong response, they get slashed. The SLA is the meter — not a separate document you wave at customer support.

This post is about why that shift matters, what it actually costs to engineer, and what changes for both sides of the market when the SLA becomes the thing being sold.

Why a fictional SLA is the default

Hosted indexing providers run on a “best effort with credit window” model because it’s the only thing that works when one company owns both the infrastructure and the contract. If you promise 10ms and your nearest region is 18ms from your customer, you have two options: refund every query (which kills the business model), or define “10ms” carefully enough that almost no event triggers a refund.

So the SLA gets diluted. “Average latency under 50ms across a rolling 30-day window, measured at the provider edge, excluding planned maintenance and force majeure” is a sentence that protects the vendor, not the customer. The number people show in benchmarks is p50. The number people actually need is p99.

This is not a moral failing on the part of any specific provider. It’s the natural attractor for a market structure with one seller per contract. You cannot price an enforceable SLA when the seller also owns the meter.

The racing market structure

StreamSync changes the seller side. Each query is dispatched to 3–5 independent operators simultaneously. They race. The first correct response wins 70% of the fee; the remaining 30% goes to the verifying nodes that confirm the answer, at 15% per verifier (capped at two). The operators who lose the race earn nothing for that query, but they also paid almost nothing to race, because the marginal cost of a DuckDB lookup over an indexed shard is a fraction of a cent.
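To make the split concrete, here is a minimal sketch of the payout arithmetic for one settled query. The 70% and 15% shares and the two-verifier cap come from the description above; the function name, the operator IDs, and the handling of any unallocated remainder (returned to the caller here) are illustrative assumptions, not the protocol's actual settlement code.

```python
from decimal import Decimal

WINNER_SHARE = Decimal("0.70")    # from the post: first correct response
VERIFIER_SHARE = Decimal("0.15")  # from the post: per confirming verifier
MAX_VERIFIERS = 2                 # from the post: capped at two


def split_fee(fee, winner, verifiers):
    """Return (payouts, leftover) for one settled query.

    `verifiers` lists the confirming verifier IDs in arrival order.
    Only the first MAX_VERIFIERS are paid. What happens to `leftover`
    when fewer than two verifiers confirm is NOT specified in the post;
    this sketch simply reports it.
    """
    payouts = {winner: fee * WINNER_SHARE}
    for verifier in list(verifiers)[:MAX_VERIFIERS]:
        payouts[verifier] = fee * VERIFIER_SHARE
    leftover = fee - sum(payouts.values())
    return payouts, leftover


if __name__ == "__main__":
    payouts, leftover = split_fee(Decimal("1.00"), "op-a", ["v1", "v2", "v3"])
    for operator_id, amount in payouts.items():
        print(operator_id, amount)
    print("leftover", leftover)
```

Losing racers never appear in the payout map, which is the whole incentive: only the winner and the confirming verifiers are paid.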

The key effect: latency is now a market-cleared quantity. If the cheapest operator in your region runs hot and starts missing the 10ms window, racing routes traffic to the next-fastest operator without anyone filing a ticket. There is no need for a “monitor the provider’s status page” loop in your operations playbook. The protocol pre-monitors for you because losing operators don’t get paid.

Compare to the single-vendor model: if your provider slows down, you submit a JIRA ticket, switch to a backup endpoint behind a feature flag, and pray the migration window is short. With StreamSync, the only thing that changes is which operator’s wallet gets the payout for that batch of queries.

What the 10ms number really means

We want to be precise about what’s being measured because most “sub-10ms” claims are wrong about the boundary.

StreamSync’s SLA is measured from when the query lands at a regional gateway to when a verified response is ready to send back. It does not include client-to-gateway transit, because we cannot control your network. It does include:

Dispatch to operators (NNG over UDP)
Operator-side DuckDB query execution against the relevant shard
Cache lookup or partial result merge across shards
Verifier consensus (at least one verifier must confirm before settlement)
Response handoff back to the gateway

When all of that fits in 10ms, the customer is billed. When it doesn’t, the customer is not billed and the gateway records a missed-SLA event for the participating operators. Persistent miss rates lower an operator’s reputation score, which lowers their probability of being selected for future queries — a slow death by traffic starvation.
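The billing rule and the reputation effect can be sketched roughly as follows. The 10ms budget is from the post; the `Gateway` class, the exponentially weighted miss rate, the `alpha` smoothing factor, the inclusive boundary at exactly 10ms, and the `selection_weight` formula are all assumptions made for illustration, since the post does not say how reputation is computed.

```python
from dataclasses import dataclass, field

SLA_BUDGET_MS = 10.0  # from the post: gateway arrival to verified response


@dataclass
class Gateway:
    alpha: float = 0.05  # hypothetical smoothing factor for the miss rate
    miss_rate: dict = field(default_factory=dict)  # operator_id -> EWMA of misses

    def settle(self, arrived_ms, verified_ms, operators):
        """Return True if the query is billable, and update miss rates.

        `verified_ms` is None when no verified response was produced.
        """
        met = verified_ms is not None and (verified_ms - arrived_ms) <= SLA_BUDGET_MS
        miss = 0.0 if met else 1.0
        for op in operators:
            old = self.miss_rate.get(op, 0.0)
            self.miss_rate[op] = (1 - self.alpha) * old + self.alpha * miss
        return met

    def selection_weight(self, op):
        """Hypothetical: operators that miss more often are picked less often."""
        return 1.0 - self.miss_rate.get(op, 0.0)


if __name__ == "__main__":
    gw = Gateway(alpha=0.5)
    print(gw.settle(0.0, 8.0, ["a", "b"]))   # inside the budget: billed
    print(gw.settle(0.0, 12.0, ["a"]))       # outside the budget: not billed
    print(gw.selection_weight("a"), gw.selection_weight("b"))
```

Note that the timer starts at the gateway, not at the client, matching the measurement boundary described above.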

Why operators play along

A reasonable objection: why would an operator accept a contract where they earn nothing if they miss the SLA? For the same reason market makers play along with maker-taker rebates on exchanges: the structure rewards the participants who are actually fast.

Operators in StreamSync are paid in proportion to their winning rate. A node that wins 30% of its races at the median price earns roughly three times the race revenue of a node that wins 10% at the same price. That creates a strong gradient toward hardware investment, geographic placement near customers, and aggressive caching of hot keys. Operators who can’t compete on latency self-select into the archive node class, which doesn’t race on hot queries at all; it serves historical depth instead, on different SLAs.
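A small worked example shows how strongly win rate drives operator income. It uses the 70% winner share stated earlier; the fee, the per-race cost, and the function name are made-up illustrative values, and verifier income is ignored.

```python
WINNER_SHARE = 0.70  # from the post: the winner's cut of each query fee


def expected_profit(win_rate, fee, cost_per_race, races):
    """Expected profit over `races` races.

    The operator pays `cost_per_race` on every race (won or lost) and
    earns WINNER_SHARE * fee only on the races it wins.
    `fee` and `cost_per_race` are hypothetical numbers in the same unit.
    """
    return races * (win_rate * WINNER_SHARE * fee - cost_per_race)


if __name__ == "__main__":
    for rate in (0.10, 0.30):
        profit = expected_profit(rate, fee=1.0, cost_per_race=0.01, races=1000)
        print(f"win rate {rate:.0%}: profit {profit:.2f}")
```

Under these assumptions, race revenue is linear in win rate, so tripling the win rate triples revenue. Because the cost of racing is paid either way, profit grows by more than a factor of three.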

Economic decentralization pays its way here: instead of one company deciding what hardware mix to deploy, dozens of operators each make their own bet about how to win the most races. Some bet on bare metal in NY4. Some bet on dense memory for cache. Some bet on storage for archive. The network ends up with diversity that no single operator could justify alone.

The verification problem

A reader who has built racing systems will spot the next question: how do you verify the winner’s answer fast enough to settle inside the 10ms budget?

The honest answer is: you don’t, fully. You verify the cheap way during the budget window and the expensive way after settlement.

Inside the budget, two verifier nodes hash the response and compare. If their hashes agree with the winner’s, the response is accepted and payment is queued. After settlement, asynchronously, a deeper verification pass can challenge any of the previous batch’s answers; if a challenge succeeds, the original winner is slashed and the challenger receives a bounty. This bifurcation is what lets the protocol pay on outcome at sub-10ms while still keeping correctness anchored to a stronger guarantee on a longer time horizon.
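The two phases can be sketched as a cheap digest comparison followed by a later re-check. SHA-256 is an assumption (the post does not name the hash function), and `fast_accept`, `resolve_challenge`, and the returned labels are hypothetical names. The deep verification pass is represented only by its recomputed payload.

```python
import hashlib


def digest(payload: bytes) -> str:
    """Hash a response payload (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(payload).hexdigest()


def fast_accept(winner_payload: bytes, verifier_digests: list) -> bool:
    """In-budget check: accept only if at least one verifier reported a
    digest and every reported digest matches the winner's."""
    want = digest(winner_payload)
    return len(verifier_digests) > 0 and all(d == want for d in verifier_digests)


def resolve_challenge(settled_digest: str, recomputed_payload: bytes) -> str:
    """Post-settlement check: compare the settled digest with an
    independently recomputed answer."""
    if digest(recomputed_payload) == settled_digest:
        return "upheld"        # challenge fails, the winner keeps the payout
    return "slash_winner"      # challenge succeeds, the challenger earns the bounty


if __name__ == "__main__":
    answer = b'{"slot": 1, "balance": 42}'
    ok = fast_accept(answer, [digest(answer), digest(answer)])
    print("accepted in budget:", ok)
    print("later challenge:", resolve_challenge(digest(answer), b'{"slot": 1, "balance": 43}'))
```

The in-budget step only proves that verifiers agree with the winner, not that the answer is right, which is why the slower challenge path still matters.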

We deliberately accept a small probability of paying for a wrong answer in exchange for the latency. Slashing makes that probability painful enough that operators self-police.

What changes for customers

A customer using StreamSync makes three changes versus a hosted RPC contract.

First, the budgeting model: you prepay a balance instead of receiving a monthly invoice. The protocol debits per successful query. Refunds don’t exist as a category, because there is no charge to refund.

Second, the failure mode: when something goes wrong, the cost shows up as more missed-SLA events in your metrics, not as outages. Your application may see a late or missing response for that query (the gateway may fulfill it out of band through a higher-latency fallback), but your bill doesn’t grow.

Third, the negotiation: there is no negotiation. Operators bid on what they will accept, customers set a ceiling on what they will pay, and the gateway clears the market each epoch. If the market moves against you, you change your ceiling. If you move against the market, you don’t get racers.
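One way to picture the clearing step is a simple uniform-price selection, sketched below. The post does not specify the actual clearing rule, so everything here is an assumption: the `clear_epoch` name, picking the cheapest eligible bids, pricing at the highest accepted bid, and using the 3–5 racer range from earlier as the minimum and maximum.

```python
def clear_epoch(bids, ceiling, min_racers=3, max_racers=5):
    """Pick racers for one epoch from operator bids.

    `bids` maps operator_id -> lowest price it will accept (integer units).
    `ceiling` is the most the customer will pay per query.
    Returns (racers, clearing_price), or None when too few operators
    bid at or under the ceiling to hold a race.
    """
    eligible = sorted((price, op) for op, price in bids.items() if price <= ceiling)
    if len(eligible) < min_racers:
        return None  # the customer moved against the market: no racers
    chosen = eligible[:max_racers]
    racers = [op for _, op in chosen]
    clearing_price = chosen[-1][0]  # price of the marginal accepted operator
    return racers, clearing_price


if __name__ == "__main__":
    bids = {"op-a": 3, "op-b": 1, "op-c": 2, "op-d": 9}
    print(clear_epoch(bids, ceiling=5))
    print(clear_epoch(bids, ceiling=2))
```

Raising the ceiling admits more operators into the race, and lowering it below what enough operators will accept leaves the query without racers.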

The honest cost

We don’t want to pretend the model has no overhead. Racing four operators per query means the network does roughly 4× the compute for each successful query. The losing operators absorb that cost and price their bids accordingly. The customer-facing price stays competitive with single-vendor RPCs because the market clears at the marginal cost of the marginal operator, but the network-wide work is genuinely higher.

We think that’s worth it for the property the model unlocks: an SLA you don’t have to trust anyone to honor, because the contract enforces it.