Reconstructing Compressed NFTs in Production

StreamSync Team · May 8, 2026

engineering · compression

State compression was the right answer to a real problem. Storing every Solana NFT as a full account did not scale. The Merkle-tree-with-on-chain-proofs model that Metaplex and the Solana Labs team landed on is genuinely elegant, and it dropped the cost of minting an NFT by something like four orders of magnitude.

It also broke the indexing model that every Solana tool had been using.

This post is about how StreamSync handles compressed account reconstruction in production — what the work actually is, why we give it its own node class, and what we learned shipping the zk-reconstruction crate against real workloads.

What compression broke

Pre-compression, an NFT was an account. You wanted to know its owner, its metadata URI, its collection — you read the account and parsed it. Maybe you used Metaplex’s helpers. Either way, the data was sitting in plain bytes on chain, and an indexer’s job was bookkeeping: keep a copy of the bytes, decode them once, hand them out fast.

Post-compression, an NFT is not an account. It’s a leaf in a Merkle tree. The tree’s root hash is on chain. The leaf’s data lives in the tree. To get the data back, you need:

The root hash at the slot you care about.
The path of sibling hashes from the leaf up to the root.
The leaf data itself, which exists only as historical Geyser-stream events or cached state from before the next mutation.
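To make the three ingredients concrete, here is a minimal sketch of checking a leaf against a root using its sibling path. It is illustrative only: the function names are hypothetical, and the standard library's DefaultHasher stands in for the cryptographic hash a real tree would use, purely to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// ASSUMPTION: a 64-bit stand-in for a real node hash. DefaultHasher is not
// cryptographic and must not be used for real proofs.
type Node = u64;

/// Hash raw leaf bytes into a leaf node.
fn hash_leaf(data: &[u8]) -> Node {
    let mut h = DefaultHasher::new();
    h.write(data);
    h.finish()
}

/// Hash two child nodes into their parent.
fn hash_pair(left: Node, right: Node) -> Node {
    let mut h = DefaultHasher::new();
    h.write_u64(left);
    h.write_u64(right);
    h.finish()
}

/// Recompute the root from a leaf node, its index in the tree, and the
/// sibling hashes ordered from the leaf level upward.
fn root_from_proof(leaf: Node, mut index: u64, siblings: &[Node]) -> Node {
    let mut node = leaf;
    for &sibling in siblings {
        // An even index means this node is the left child at this level.
        node = if index % 2 == 0 {
            hash_pair(node, sibling)
        } else {
            hash_pair(sibling, node)
        };
        index /= 2;
    }
    node
}

/// True when the leaf data, index, and path reproduce the expected root.
fn verify(root: Node, leaf_data: &[u8], index: u64, siblings: &[Node]) -> bool {
    root_from_proof(hash_leaf(leaf_data), index, siblings) == root
}
```

For a four-leaf tree, verifying the leaf at index 2 takes two siblings: the leaf at index 3 and the parent of leaves 0 and 1. The cost of one check is one hash per tree level, which is why a single lookup is cheap and a scan over thousands of leaves is not.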

This is fine for a one-off lookup. It is not fine for the queries indexers actually receive — “give me all compressed NFTs in this collection owned by this wallet” — because that involves reconstructing a few thousand leaves from a tree that may have been mutated thousands of times since the leaves were minted.

The hosted indexer industry’s first answer was to pre-decompress everything: ingest the Geyser stream, reconstruct each leaf as it appears, store the decompressed form in Postgres, query Postgres. This works, but the storage cost grows as the NFT supply grows, and the cost is paid by whoever runs the indexer regardless of whether anyone ever queries the data.

Why a separate node class

The fundamental problem with the pre-decompress-everything strategy is that it prices indexing at the largest customer’s worst case. If even one customer wants to query all of a giant collection, the indexer has to store all of it forever. Everyone else pays for that.

StreamSync’s structural answer is the ZK reconstruction node class. These operators specialize in on-demand reconstruction. They subscribe to the Geyser-derived event stream like everyone else, but instead of pre-decompressing every leaf, they hold the proof structures and recent root states and reconstruct leaves when queries arrive.

The trade-off is latency. A “give me this compressed NFT by mint” query that would take 2ms against a pre-decompressed Postgres takes 6-9ms against a ZK reconstruction node. That fits inside the 10ms SLA, but barely, and the operators who run this class have to invest in CPU and memory to keep it there.

In exchange, compressed NFT lookups can be priced per query at the actual cost of reconstruction, rather than at a flat cost amortized across all queries. Customers who hit one compressed NFT a day pay essentially nothing; customers who scan an entire collection pay proportionally for the work.

This is a recurring pattern in the StreamSync design: don’t pick the strategy at the protocol level; pick the structure that lets the market price each strategy fairly.

What the reconstruction code does

The zk-reconstruction crate is small — eight tests at the time of writing — because the algorithm is straightforward once you have the right data sources.

The inputs are: a target leaf identity (typically a Metaplex asset ID), a target slot (often “latest”), and the current state of the relevant Merkle tree. The crate’s job is to identify the most recent leaf state at or before the target slot and emit a structured representation of the leaf data.

The interesting work is in the data layout, not the algorithm. We keep per-tree append-only logs of leaf-mutation events, indexed by mint and by collection. When a reconstruction request comes in, the crate seeks into the log at the right place, replays the mutations forward until it has the leaf state at the target slot, and returns. Replays are cheap because each step is a single hash and a pointer update; the cost is dominated by the log read.
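The replay step can be sketched roughly as follows. This is not the crate's code: the event shape, field names, and the reconstruct function are assumptions made for illustration, and a linear filter over one tree's log stands in for the per-mint index that would let a real implementation seek directly to the relevant entries.

```rust
/// Hypothetical mutation kinds; the real event schema may differ.
#[derive(Clone, Debug, PartialEq)]
enum Mutation {
    Mint { owner: String },
    Transfer { new_owner: String },
    Burn,
}

/// One entry in a per-tree append-only log (assumed layout).
#[derive(Clone, Debug, PartialEq)]
struct LogEntry {
    slot: u64,
    leaf_index: u32,
    mutation: Mutation,
}

/// Decoded leaf state, materialized only when a query asks for it.
#[derive(Clone, Debug, PartialEq)]
struct LeafState {
    owner: String,
    last_slot: u64,
}

/// Return the state of one leaf at or before `target_slot`, or None if the
/// leaf did not exist (not yet minted, or burned) at that slot.
/// The log must be ordered by slot, which an append-only log is by design.
fn reconstruct(log: &[LogEntry], leaf_index: u32, target_slot: u64) -> Option<LeafState> {
    // Binary search for the cutoff: every entry before `end` has slot <= target_slot.
    let end = log.partition_point(|e| e.slot <= target_slot);
    let mut state: Option<LeafState> = None;
    // Replay this leaf's mutations forward in log order.
    for entry in log[..end].iter().filter(|e| e.leaf_index == leaf_index) {
        state = match &entry.mutation {
            Mutation::Mint { owner } => Some(LeafState {
                owner: owner.clone(),
                last_slot: entry.slot,
            }),
            // A transfer only applies to a leaf that currently exists.
            Mutation::Transfer { new_owner } => state.map(|_| LeafState {
                owner: new_owner.clone(),
                last_slot: entry.slot,
            }),
            Mutation::Burn => None,
        };
    }
    state
}
```

Passing a very large target slot behaves like "latest", and passing an older slot returns the leaf as it stood then, without any decoded copy having been stored in between.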

What we deliberately do not do: store a separate copy of every decoded leaf. The log is the source of truth; the decoded form is materialized only when queried.

What we got wrong the first time

Two things.

First, we tried to use DuckDB to store the mutation logs. DuckDB is a great analytical engine, but it is not the right answer for append-heavy single-row writes. The WAL pressure was painful enough that we moved log storage to a purpose-built append-only log file format and kept DuckDB for query-time aggregation across reconstructed leaves. That refactor cost us six weeks of operator-facing development time. It was the right call but it stings.

Second, we under-provisioned for the metadata cache. Solana NFTs have URIs that point to off-chain metadata; some collections are good citizens about this and serve metadata from Cloudflare, others are not and serve from a single self-hosted server that goes down regularly. We initially treated off-chain metadata as part of the reconstruction; we now serve a “without metadata” result fast and let the caller decide whether they want us to also block on the off-chain fetch. This dropped the p99 latency by an order of magnitude for collections with flaky URI hosts.
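The shape of that split can be sketched as below. The types, the include_metadata flag, and the injected fetch function are all hypothetical names chosen for the example, not the actual API; the point is only that the slow, failure-prone step runs solely when the caller opts in, and that a failed fetch degrades the result instead of failing the lookup.

```rust
/// Reconstructed on-chain portion of a leaf (assumed fields).
#[derive(Debug, Clone, PartialEq)]
struct Leaf {
    owner: String,
    uri: String,
}

/// Outcome of the optional off-chain step.
#[derive(Debug, Clone, PartialEq)]
enum Metadata {
    /// The caller asked for the fast, "without metadata" result.
    NotRequested,
    /// The off-chain document was fetched successfully.
    Fetched(String),
    /// The URI host failed; the leaf itself is still returned.
    Unavailable(String),
}

#[derive(Debug, Clone, PartialEq)]
struct LookupResult {
    leaf: Leaf,
    metadata: Metadata,
}

/// Return the leaf immediately unless the caller opts in to blocking on the
/// off-chain fetch. `fetch` is injected so the sketch needs no HTTP client.
fn lookup<F>(leaf: Leaf, include_metadata: bool, fetch: F) -> LookupResult
where
    F: FnOnce(&str) -> Result<String, String>,
{
    let metadata = if include_metadata {
        match fetch(&leaf.uri) {
            Ok(body) => Metadata::Fetched(body),
            Err(reason) => Metadata::Unavailable(reason),
        }
    } else {
        // Fast path: the fetch closure is never invoked.
        Metadata::NotRequested
    };
    LookupResult { leaf, metadata }
}
```

With this shape, a flaky URI host can only slow down callers who explicitly asked to wait for it.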

The result in production

A typical ZK reconstruction query in a populated region completes in 5-9ms end to end, including the gateway round trip. That is genuinely fast for a workload that used to require a multi-second round trip against a hosted indexer’s REST API.

For sustained collection scans — “give me every compressed NFT in this collection currently owned by this wallet” — the gateway dispatches a sharded sub-query plan: each ZK operator reconstructs the leaves in its assigned key range, returns the matching ones, and the cache-optimizer merger does the union and the final filter. The whole thing finishes in 30-60ms for collections with thousands of NFTs, which is competitive with the best pre-decompressed approaches but doesn’t require anyone to have paid the storage cost upfront.
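A rough sketch of such a plan is below, run sequentially for simplicity. Everything here is an assumption for illustration: the function names are hypothetical, an in-memory slice stands in for per-operator reconstruction, leaf indexes are assumed to run from 0 to the leaf count, and the split of filtering (collection at the operator, owner at the merger) is one plausible reading rather than a description of the real planner.

```rust
use std::ops::Range;

#[derive(Debug, Clone, PartialEq)]
struct Leaf {
    index: u32,
    collection: String,
    owner: String,
}

/// Split [0, leaf_count) into at most `shards` contiguous, non-overlapping ranges.
fn plan(leaf_count: u32, shards: u32) -> Vec<Range<u32>> {
    let shards = shards.max(1);
    let size = (leaf_count + shards - 1) / shards; // ceiling division
    (0..shards)
        .map(|i| (i * size).min(leaf_count)..((i + 1) * size).min(leaf_count))
        .filter(|r| !r.is_empty())
        .collect()
}

/// One operator's share: keep the leaves in its key range that belong to the
/// collection. Reading from `tree` stands in for on-demand reconstruction.
fn run_shard(tree: &[Leaf], range: Range<u32>, collection: &str) -> Vec<Leaf> {
    tree.iter()
        .filter(|l| range.contains(&l.index) && l.collection == collection)
        .cloned()
        .collect()
}

/// Merger: union the shard results, apply the final owner filter, and order by index.
fn merge(parts: Vec<Vec<Leaf>>, owner: &str) -> Vec<Leaf> {
    let mut out: Vec<Leaf> = parts
        .into_iter()
        .flatten()
        .filter(|l| l.owner == owner)
        .collect();
    out.sort_by_key(|l| l.index);
    out
}

/// Full scan: plan the shards, run each one, then merge.
fn scan(tree: &[Leaf], shards: u32, collection: &str, owner: &str) -> Vec<Leaf> {
    let parts: Vec<Vec<Leaf>> = plan(tree.len() as u32, shards)
        .into_iter()
        .map(|range| run_shard(tree, range, collection))
        .collect();
    merge(parts, owner)
}
```

Because the ranges are disjoint and cover every leaf, the merged result does not depend on how many shards the plan uses; only the amount of work per operator changes.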

What this means for the network

The ZK reconstruction class is small relative to speed runners and cache optimizers, but its existence is what lets StreamSync claim a complete Solana surface area. We are not a “fast for the easy queries, sorry about the hard ones” indexer. We pay operators directly to handle the hard ones, and we route work to them only when it is needed.

That’s the structural answer to a recurring question: how do you build a multi-operator network that is competitive on the workloads that benefit from specialization? You make specialization a first-class economic role, you reward the operators who do the hard work, and you let the customer pay the actual cost of what they asked for.