Bitcoin Schnorr Signatures and Key Aggregation Explained

The Problem With Proving Five People Agreed

You're the CFO. Three of your five executives need to approve any treasury spend, and you've just submitted a transaction to the Bitcoin network that broadcasts, in plain sight, exactly how your security is structured. Every signature, every public key, stacked on-chain like a safe with the combination written on the door.

Schnorr signatures fix this. With key aggregation, the signing executives collapse into a single public key and a single signature that looks, on-chain, identical to a single-person spend. One key, one signature, done. The mechanics behind that trick are worth understanding properly.

What Makes Schnorr Different From What Came Before

Bitcoin launched with ECDSA (Elliptic Curve Digital Signature Algorithm), and it worked fine for a decade. But ECDSA has a structural quirk: signatures aren't linearly combinable. Two valid ECDSA signatures can't simply be added together to produce a third valid signature covering both. Each one has to stand alone and be verified alone.

Schnorr signatures, formalized for Bitcoin in BIP-340 and activated as part of Taproot, are different in one mathematically important way: they're linear. That linearity is the entire engine. It means you can perform algebraic operations across multiple signatures and keys and have the result still satisfy the verification equation.

Here's the core equation. A signer with private key `x` and public key `P = x·G` (where `G` is the elliptic curve generator point) produces a signature `(R, s)` where:

`s = r + H(R, P, m) · x`

`r` is a secret nonce, `R = r·G` is its corresponding curve point, `H` is a hash function, and `m` is the message. Verification checks that `s·G = R + H(R, P, m)·P`. The linearity lives in that `s` equation: for a shared challenge hash, `s` is a linear function of `r` and `x`, so `s` values from multiple signers can be summed, and the total verifies against the sum of their nonce points and the sum of their keys.
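The equations above can be checked with a toy single-signer sketch. It uses the real secp256k1 curve parameters but makes simplifying assumptions that BIP-340 does not: plain SHA-256 over full `(x, y)` coordinates instead of tagged hashes over x-only keys, and no deterministic nonce derivation. The helper names (`add`, `mul`, `H`, `sign`, `verify`) are illustrative, not a library API, and none of this is suitable for real funds.

```python
import hashlib
import secrets

# secp256k1 field prime, group order, and generator point.
FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % FIELD, -1, FIELD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD, -1, FIELD)
    lam %= FIELD
    x3 = (lam * lam - x1 - x2) % FIELD
    return (x3, (lam * (x1 - x3) - y1) % FIELD)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    result = None
    k %= N
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def H(R, P, m):
    """Toy challenge hash: SHA-256 over both points and the message."""
    data = b"".join(c.to_bytes(32, "big") for c in (*R, *P)) + m
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(x, m):
    r = secrets.randbelow(N - 1) + 1      # fresh secret nonce
    R = mul(r, G)
    s = (r + H(R, mul(x, G), m) * x) % N  # s = r + H(R, P, m) * x
    return R, s

def verify(P, m, sig):
    R, s = sig
    return mul(s, G) == add(R, mul(H(R, P, m), P))  # s*G == R + H*P

x = secrets.randbelow(N - 1) + 1
P = mul(x, G)
msg = b"pay 1 BTC to treasury"
sig = sign(x, msg)
print(verify(P, msg, sig))
```

Changing the message changes the challenge hash, so the same `(R, s)` pair no longer satisfies the verification equation.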

The Key Aggregation Mechanism, Step by Step

The specific protocol used in Bitcoin's Taproot context is MuSig2 (the second iteration of the MuSig scheme, specified in BIP-327). Take a concrete scenario: Alice, Bob, and Carol each hold a key and all three must sign.

First, key aggregation. Each participant shares their public key. The combined public key `P_agg` is not simply `P_alice + P_bob + P_carol`. That naive sum is vulnerable to a rogue-key attack: Carol could announce a crafted key like `P_carol_fake = P_real - P_alice - P_bob`, so the sum collapses to `P_real`, a key that she alone can sign for. MuSig2 prevents this by deriving a per-participant coefficient from a hash of the full key list together with that participant's key, and multiplying each key by its coefficient before summing:

`P_agg = a_alice·P_alice + a_bob·P_bob + a_carol·P_carol`

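where each `a_i = H(L, P_i)` and `L` is the list of all public keys (sorted first, if the signers want the aggregate to be independent of ordering). Because every coefficient depends on every key, Carol can no longer pick a key that cancels the others: changing her key changes all the coefficients.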
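The rogue-key attack and the coefficient fix can both be reproduced in a few lines. This is a toy sketch on secp256k1, assuming plain SHA-256 for `H` and a simple sorted-list encoding for `L`; BIP-327's actual key aggregation differs in encoding details. The names `neg`, `enc`, and `key_agg` are illustrative.

```python
import hashlib
import secrets

FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % FIELD, -1, FIELD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD, -1, FIELD)
    lam %= FIELD
    x3 = (lam * lam - x1 - x2) % FIELD
    return (x3, (lam * (x1 - x3) - y1) % FIELD)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    result = None
    k %= N
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def neg(pt):
    """Negate a point (flip the y coordinate)."""
    return None if pt is None else (pt[0], (-pt[1]) % FIELD)

def enc(pt):
    return pt[0].to_bytes(32, "big") + pt[1].to_bytes(32, "big")

def key_agg(keys):
    """P_agg = sum(a_i * P_i) with a_i = H(L, P_i); toy encoding of L."""
    L = b"".join(enc(P) for P in sorted(keys))
    agg = None
    for P in keys:
        a = int.from_bytes(hashlib.sha256(L + enc(P)).digest(), "big") % N
        agg = add(agg, mul(a, P))
    return agg

P_alice = mul(secrets.randbelow(N - 1) + 1, G)
P_bob = mul(secrets.randbelow(N - 1) + 1, G)
P_real = mul(secrets.randbelow(N - 1) + 1, G)   # Carol knows this key's secret

# Carol announces a key crafted to cancel Alice's and Bob's contributions.
P_carol_fake = add(P_real, neg(add(P_alice, P_bob)))

naive_sum = add(add(P_alice, P_bob), P_carol_fake)
print(naive_sum == P_real)                                 # naive sum is hijacked
print(key_agg([P_alice, P_bob, P_carol_fake]) == P_real)   # coefficients block it
```

With the naive sum, Carol ends up as sole owner of the "shared" key. With hashed coefficients, the crafted key no longer cancels anything, because she would have to know the coefficients before choosing the key that determines them.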

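Next, the signing round. Each signer generates two secret nonces and shares the two corresponding nonce points. (The original MuSig needed an extra round of nonce commitments to stay secure when many signing sessions run concurrently; MuSig2's second nonce removes that round. It does not make nonce reuse safe: reusing a nonce still leaks the private key.) Once the nonce points are exchanged and combined into `R_agg`, each signer computes a partial signature, where `r_i` stands for that signer's effective nonce (a hash-weighted combination of their two secret nonces):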

`s_i = r_i + H(R_agg, P_agg, m) · a_i · x_i`

The final signature is `s = s_alice + s_bob + s_carol`, combined with the aggregated nonce point `R_agg`. That single `(R_agg, s)` pair verifies against `P_agg` using the standard Schnorr equation. Anyone checking the blockchain sees one public key and one signature. They cannot tell whether one person signed or thirty.
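The whole flow for Alice, Bob, and Carol can be sketched end to end. This is a toy model of the two-nonce structure, not a BIP-327 implementation: it assumes plain SHA-256 hashing, full `(x, y)` points, and no parity handling, and it runs all three signers in one process. The nonce coefficient `b` and the helper names are illustrative; in the formula above, `r_i` corresponds to `r1 + b·r2` here.

```python
import hashlib
import secrets

FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % FIELD, -1, FIELD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD, -1, FIELD)
    lam %= FIELD
    x3 = (lam * lam - x1 - x2) % FIELD
    return (x3, (lam * (x1 - x3) - y1) % FIELD)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    result = None
    k %= N
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def enc(pt):
    return pt[0].to_bytes(32, "big") + pt[1].to_bytes(32, "big")

def H(*chunks):
    """Toy hash-to-scalar over concatenated byte strings."""
    return int.from_bytes(hashlib.sha256(b"".join(chunks)).digest(), "big") % N

def rand_scalar():
    return secrets.randbelow(N - 1) + 1

msg = b"treasury spend"
keys_x = [rand_scalar() for _ in range(3)]          # Alice, Bob, Carol
pubkeys = [mul(x, G) for x in keys_x]

# Key aggregation: a_i = H(L, P_i), P_agg = sum(a_i * P_i).
L = b"".join(enc(P) for P in sorted(pubkeys))
coeffs = [H(L, enc(P)) for P in pubkeys]
P_agg = None
for a, P in zip(coeffs, pubkeys):
    P_agg = add(P_agg, mul(a, P))

# Round 1: every signer shares two nonce points; they are summed per slot.
nonces = [(rand_scalar(), rand_scalar()) for _ in range(3)]
R1 = R2 = None
for r1, r2 in nonces:
    R1, R2 = add(R1, mul(r1, G)), add(R2, mul(r2, G))
b = H(enc(P_agg), enc(R1), enc(R2), msg)             # binds nonces to this session
R_agg = add(R1, mul(b, R2))
e = H(enc(R_agg), enc(P_agg), msg)                   # shared challenge

# Round 2: partial signatures s_i = (r1 + b*r2) + e * a_i * x_i, then summed.
partials = [(r1 + b * r2 + e * a * x) % N
            for (r1, r2), a, x in zip(nonces, coeffs, keys_x)]
s = sum(partials) % N

# A verifier sees only (R_agg, s) and P_agg: the ordinary Schnorr check.
print(mul(s, G) == add(R_agg, mul(e, P_agg)))
```

The final check is the same equation a single-signer verifier would run, which is why the aggregate is indistinguishable on-chain. Dropping any one partial signature from the sum makes the check fail.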

What People Get Wrong About This

The most common misconception is that Schnorr key aggregation is automatic and costless. It isn't.

The interactive rounds matter. MuSig2 requires two rounds of communication between signers before a signature can be produced. All participants need to be online and responsive at roughly the same time. For a company treasury that might mean a Slack thread and an API call. For a protocol where signers are scattered across time zones and some are hardware wallets locked in a vault, coordination adds real friction. This is not a protocol for fire-and-forget signing.

Key aggregation as described above is also an all-or-nothing scheme: every listed key must participate. The "k-of-n" threshold case (say, any 3 of 5 keys) requires a different construction: either a threshold signature scheme such as FROST (Flexible Round-Optimized Schnorr Threshold signatures), which still yields a single key-path signature, or Taproot's script path, for example with one n-of-n aggregate key per allowed quorum. MuSig2 alone handles the n-of-n case only. Most explainers skip this distinction. It matters enormously when you're designing a custody setup.
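To get a feel for why k-of-n is a separate problem, consider what it would take to cover a 3-of-5 policy using n-of-n aggregates alone: one aggregate key per possible quorum. The short sketch below just counts them; the executive names are placeholders.

```python
from itertools import combinations
from math import comb

executives = ["exec_a", "exec_b", "exec_c", "exec_d", "exec_e"]
threshold = 3

# Every distinct group of 3 would need its own n-of-n aggregate key.
quorums = list(combinations(executives, threshold))
print(len(quorums))  # 10

# The count grows quickly with the size of the signer set.
for n, k in [(5, 3), (10, 6), (15, 8)]:
    print(f"{k}-of-{n}: {comb(n, k)} quorums")
```

Ten quorums is manageable; a few thousand is not, which is part of the appeal of a true threshold scheme for larger signer sets.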

And while the aggregate looks like a single-signer transaction on-chain, the participants themselves know who they are. Privacy is gained against external observers, not against the signers.

The Fee and Privacy Arithmetic

The gains are concrete enough to quantify. A legacy 2-of-3 multisig input spent from P2SH takes up roughly 297 vbytes. A Taproot keypath spend, where all signers cooperate and produce an aggregated Schnorr signature, costs around 57.5 vbytes for the input. At any given fee rate, that's a reduction of roughly 80% in input size for the cooperative case. On a wallet processing thousands of treasury transactions a year, that compounds into a meaningful operating cost difference.
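The size comparison can be rebuilt from the byte layout of each input. The sketch below assumes compressed 33-byte public keys and 72-byte ECDSA signatures (DER plus sighash byte, the usual upper estimate) for the legacy case, and a default-sighash 64-byte Schnorr signature for Taproot. Variable names are illustrative.

```python
# Legacy P2SH 2-of-3 input: everything is non-witness data, 1 byte = 1 vbyte.
SIG = 72       # assumed ECDSA signature size including sighash byte
PUBKEY = 33    # compressed public key

# Redeem script: OP_2 <key> <key> <key> OP_3 OP_CHECKMULTISIG
redeem_script = 1 + 3 * (1 + PUBKEY) + 1 + 1
# scriptSig: OP_0 <sig> <sig> OP_PUSHDATA1 <len> <redeem script>
script_sig = 1 + 2 * (1 + SIG) + 2 + redeem_script
# Input: outpoint + 3-byte scriptSig length varint + scriptSig + sequence
p2sh_input = 36 + 3 + script_sig + 4

# Taproot keypath input: witness bytes count as a quarter of a vbyte each.
non_witness = 36 + 1 + 4    # outpoint + empty scriptSig length + sequence
witness = 1 + 1 + 64        # item count + signature length + Schnorr signature
taproot_input = non_witness + witness / 4

saving = 1 - taproot_input / p2sh_input
print(f"{p2sh_input} vs {taproot_input} vbytes, saving {saving:.1%}")
# 297 vs 57.5 vbytes, saving 80.6%
```

The saving applies per input, so a transaction that consolidates many multisig inputs benefits the most.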

Privacy is the less obvious gain, and the more underappreciated one. Before Taproot, a 2-of-3 P2SH output revealed itself as multisig the moment it was spent, because the redeem script listing all three public keys was published on-chain for any block explorer or chain-analysis firm to read. Taproot keypath spends are indistinguishable from single-key spends. A cooperative Lightning Network channel close, a multisig treasury payment, and your friend sending you 0.01 BTC all look the same on-chain. That uniformity is the anonymity set, and Schnorr aggregation dramatically expands it.

Consider two colleagues, Maya and Priya, who each set up 2-of-3 multisig wallets for their respective businesses at the same time. Maya used legacy P2SH; Priya used Taproot with an aggregated Schnorr key (built on a threshold scheme, since MuSig2 on its own is n-of-n). A chain-analysis firm profiling transactions can immediately identify Maya's spends as multisig and flag them for enhanced scrutiny. Priya's keypath spends are invisible in the crowd, indistinguishable from a single grandmother sending sats to her nephew. Same security model, very different exposure.

So here's the question worth sitting with: if the privacy gain is this concrete, why are so many custody providers still defaulting to legacy multisig constructions that advertise themselves to every chain-surveillance firm on the market?

One Key, But Not a Simpler Life

Schnorr key aggregation is one of those improvements that sounds like a free lunch until you read the full specification. The cryptography is genuinely elegant, like a long algebraic sentence that somehow resolves into a single clean word. But the engineering around it (the coordination requirements, the distinction between n-of-n and threshold schemes, the two-round signing protocol in MuSig2) adds real complexity that sits below the surface of that clean single signature.

The on-chain result looks simple because a lot of careful work happened off-chain to produce it. That's not a criticism. That's exactly how good cryptographic engineering is supposed to work: complexity lives where developers can manage it, not where every node on the network has to pay for it.

The catch: if you're evaluating a custody solution that claims Schnorr-based multisig, the right question isn't whether it uses Schnorr. It's whether the coordination layer is robust enough that the two-round signing protocol won't become the single point of failure you were trying to design around in the first place.