While the tech industry enthusiastically discusses the energy efficiency of Mamba and Mamba-2 as potential 'transformer-killers,' researchers from Ghent University–imec have uncovered a critical structural flaw. A new study by Alexandre Le Mercier, Chris Develder, and Thomas Demeester introduces Hidden State Poisoning Attacks (HiSPA), a method that irreversibly overwrites a model's hidden states using short trigger phrases. The problem lies in the architecture itself: unlike the Attention mechanism, which keeps the entire context in view, State Space Models (SSMs) rely on a compressed 'internal summary.' One precise strike, and this summary is wiped clean, inducing a form of partial amnesia in the neural network.
To test this hypothesis, the authors developed the ROBENCH-25 benchmark, and the results are a wake-up call for current SSM implementations under adversarial pressure. AI21 Labs' Jamba-1.7-Mini, a 52-billion-parameter hybrid model, was particularly hard hit. In testing, the model completely lost its ability to retrieve information after encountering HiSPA triggers, whereas classic transformers remained stable. It turns out that the selectivity mechanism giving Mamba its linear complexity is also its Achilles' heel: the hidden state can be literally 'flushed out' via a black-box attack that requires no complex optimization. This isn't just a theoretical bug; it's a tool that exponentially boosts the success rate of standard prompt injections.
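To make the "compressed summary" and selectivity points concrete, here is an added example (not code from the study or from any Mamba implementation) using a toy scalar selective recurrence, h_t = exp(-a·Δ_t)·h_{t-1} + Δ_t·x_t with Δ_t = softplus(w·x_t + b). The parameters, inputs and the per-step retention check are constructed assumptions; the code measures how much of an early "needle" token survives in the final state with and without a single high-Δ input.

```python
import math

# Toy model only: a, w, b and all inputs below are constructed assumptions.

def softplus(z):
    return math.log1p(math.exp(z))

def step_sizes(xs, w=1.0, b=-4.0):
    """Input-dependent step size (the 'selectivity'): dt_t = softplus(w*x_t + b)."""
    return [softplus(w * x + b) for x in xs]

def selective_scan(xs, a=1.0, w=1.0, b=-4.0):
    """Run h_t = exp(-a*dt_t) * h_{t-1} + dt_t * x_t and return the final state."""
    h = 0.0
    for x, dt in zip(xs, step_sizes(xs, w, b)):
        h = math.exp(-a * dt) * h + dt * x
    return h

def needle_retention(xs, needle_idx=0, a=1.0, w=1.0, b=-4.0):
    """Fraction of the needle token's write that is still present in the final state."""
    dts = step_sizes(xs, w, b)
    return math.exp(-a * sum(dts[needle_idx + 1:]))

def low_retention_steps(xs, threshold=0.5, a=1.0, w=1.0, b=-4.0):
    """Positions whose single-step decay factor exp(-a*dt_t) falls below threshold."""
    return [i for i, dt in enumerate(step_sizes(xs, w, b))
            if math.exp(-a * dt) < threshold]

# Constructed inputs: one "needle" value followed by 20 small filler values.
NEEDLE, FILLER = 1.0, 0.1
benign = [NEEDLE] + [FILLER] * 20
# Same sequence with one large-magnitude value at position 10 (constructed).
poisoned = benign[:10] + [12.0] + benign[11:]

if __name__ == "__main__":
    for name, seq in (("benign", benign), ("poisoned", poisoned)):
        print(name, round(needle_retention(seq), 6), low_retention_steps(seq))
```

Because the state is a single running summary, one step with a large Δ multiplies everything written before it by a near-zero factor, and nothing later in the sequence can restore it; an attention layer would still hold the needle token in its context.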
For businesses deploying RAG systems or automated document analytics, this represents a direct security threat. An attacker only needs to embed a trigger in a PDF or a research paper to 'blind' the AI to entire sections of text. Consequences range from critical authorization errors to bypassing safety guardrails and ignoring confidentiality markers. In agentic systems, where planning depends entirely on context integrity, such a vulnerability is fatal. Tech leads must face a hard truth: until hidden state protection catches up with the architectural ambitions of SSMs, any savings on compute costs will come bundled with a fundamental fragility against targeted data poisoning.