Ramp Labs proposes a new memory-sharing approach for multi-agent systems, cutting token consumption by up to 65%.

ME News, April 11 (UTC+8). AI infrastructure company Ramp Labs has released research titled "Latent Briefing," which achieves efficient memory sharing in multi-agent systems by directly compressing the large model's KV cache, substantially reducing token consumption without sacrificing accuracy.

In mainstream multi-agent architectures, an orchestrator decomposes tasks and repeatedly calls worker models; as the reasoning chain extends, token usage grows exponentially. The core idea of Latent Briefing is to use the attention mechanism to identify the genuinely critical parts of the context and discard redundant information directly at the representation layer, rather than relying on slow LLM-generated summaries or unstable RAG retrieval.

On the LongBench v2 benchmark, the method performed strongly: worker-model token consumption fell by 65%, median token savings on medium-length documents (32k to 100k tokens) reached 49%, overall accuracy improved by about 3 percentage points over the baseline, and each compression step added only about 1.7 seconds of latency, roughly 20 times faster than the original algorithm. The experiments used Claude Sonnet 4 as the orchestrator and Qwen3-14B as the worker model, covering document types including academic papers, legal documents, novels, and government reports.

The study also found that the optimal compression threshold varies with task difficulty and document length: more aggressive compression suits difficult tasks, where it filters out speculative-reasoning noise, while lighter compression better preserves the dispersed key information in long documents. (Source: BlockBeats)
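The article does not publish implementation details, but the attention-based pruning idea described above can be sketched. The Python snippet below is a hypothetical illustration only: the function name `prune_kv_cache`, the tensor shapes, the scoring heuristic (summing attention mass per position), and the keep ratio are all assumptions for the sketch, not Ramp Labs' actual method.

```python
# Minimal sketch of attention-based KV cache pruning in the spirit of
# Latent Briefing. All names and the scoring heuristic are assumptions.
import torch

def prune_kv_cache(keys, values, attn_weights, keep_ratio=0.35):
    """Keep only the context positions that receive the most attention.

    keys, values: (num_heads, seq_len, head_dim) KV tensors for one layer.
    attn_weights: (num_heads, num_queries, seq_len) attention probabilities
                  from recent decoding steps.
    keep_ratio:   fraction of positions to retain (0.35 ~= a 65% token cut).
    """
    # Aggregate attention mass per context position across heads and queries.
    scores = attn_weights.sum(dim=(0, 1))               # (seq_len,)
    k = max(1, int(keys.shape[1] * keep_ratio))
    keep = torch.topk(scores, k).indices.sort().values  # keep original order
    return keys[:, keep, :], values[:, keep, :], keep

# Toy usage: 8 heads, 128 context tokens, 64-dim heads, 4 recent queries.
H, S, D, Q = 8, 128, 64, 4
keys, values = torch.randn(H, S, D), torch.randn(H, S, D)
attn = torch.softmax(torch.randn(H, Q, S), dim=-1)
k2, v2, kept = prune_kv_cache(keys, values, attn)
print(k2.shape, f"kept {kept.numel()}/{S} positions")
```

The point of pruning at the representation layer is that the worker never re-reads the dropped tokens at all, which is where the reported latency advantage over LLM summarization would come from.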
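The threshold finding can likewise be illustrated with a toy heuristic. The cut-offs and ratios below are invented for illustration and do not come from the paper.

```python
# Hypothetical rule mapping task difficulty and document length to a
# pruning keep-ratio, mirroring the article's qualitative finding.
def choose_keep_ratio(difficulty: str, doc_tokens: int) -> float:
    if difficulty == "hard":
        return 0.25   # aggressive: filter speculative-reasoning noise
    if doc_tokens > 100_000:
        return 0.60   # light: preserve dispersed key info in long docs
    return 0.35       # default, roughly the reported 65% token reduction

print(choose_keep_ratio("hard", 40_000))  # 0.25
```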
