The Hyperscale Lottery: How State-Space Models Have Sacrificed Edge Efficiency
cs.AR
Abstract
The Hardware Lottery posits that research directions are dictated by the silicon compute platforms available. We identify a derivative phenomenon, the Hyperscale Lottery, in which model architectures are optimized for cloud throughput at the expense of algorithmic efficiency. While State-Space Models (SSMs) such as Mamba were lauded for their linear complexity, which makes them ideal for edge intelligence, their evolution from Mamba-1 to Mamba-3 reveals a systematic divergence from edge-native efficiency. We demonstrate that Mamba-3's architectural changes, designed to saturate hyperscale GPUs, impose a significant edge penalty: a 28% latency increase at 880M parameters, worsening to 48% for 15M-parameter models. We argue for decoupling cloud-scale saturation strategies from core architectural design to preserve the viability of single-user, real-time edge intelligence.