Wave routing uses a small set of wave parameters; the rest of the weights can be quantized more aggressively than in a typical transformer. Exact sizes, perplexity numbers, and KV-cache comparisons are not published here.
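As a rough illustration of the idea above, the sketch below keeps a small group of parameters at full precision and applies low-bit symmetric quantization to everything else. It is not the project's implementation. The `wave.` name prefix, the 4-bit default, the choice to leave the wave parameters unquantized, and all function names are assumptions made for this example only.

```python
from typing import Dict, List


def quantize_dequantize(values: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantization, then dequantization back to floats.

    Generic textbook scheme, used here only for illustration.
    """
    if bits < 2:
        raise ValueError("bits must be >= 2")
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    levels = 2 ** (bits - 1) - 1      # e.g. 7 positive levels for 4 bits
    scale = max_abs / levels          # one scale per tensor
    return [round(v / scale) * scale for v in values]


def mixed_precision(
    params: Dict[str, List[float]],
    protected_prefix: str = "wave.",  # hypothetical naming convention
    low_bits: int = 4,                # hypothetical bit width
) -> Dict[str, List[float]]:
    """Leave parameters whose name starts with protected_prefix untouched;
    quantize all other parameters to low_bits."""
    out: Dict[str, List[float]] = {}
    for name, values in params.items():
        if name.startswith(protected_prefix):
            out[name] = list(values)  # assumption: kept at full precision
        else:
            out[name] = quantize_dequantize(values, low_bits)
    return out


def max_abs_error(a: List[float], b: List[float]) -> float:
    """Largest elementwise absolute difference between two lists."""
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


if __name__ == "__main__":
    # Toy values, not taken from any real model.
    params = {
        "wave.freq": [0.123456, -0.987654],
        "mlp.weight": [0.5, -0.25, 0.1, 1.0],
    }
    quantized = mixed_precision(params)
    for name in params:
        print(name, max_abs_error(params[name], quantized[name]))
```

With this scheme the per-value rounding error of a quantized tensor is bounded by half its scale, while the protected parameters are reproduced exactly. Whether that split matches the actual wave-routing setup is not something this sketch can confirm.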
See GitHub for the full write-up.