Ramp Labs proposes a new multi-agent memory sharing solution, reducing token consumption by up to 65%.
ME News, April 11 (UTC+8): AI infrastructure company Ramp Labs has released research titled "Latent Briefing," a method for efficient memory sharing in multi-agent systems that compresses the large model's KV cache directly, significantly reducing token consumption without sacrificing accuracy. In mainstream multi-agent architectures, an orchestrator breaks a task into subtasks and repeatedly calls worker models; as the reasoning chain lengthens, token usage grows exponentially. The core idea of Latent Briefing is to use the attention mechanism to identify the parts of the context that actually matter and to discard redundant information at the representation layer, rather than relying on slow LLM-generated summaries or unstable RAG retrieval. On the LongBench v2 benchmark, worker model token consumption fell by as much as 65%, median token savings on medium-length documents (32k to 100k) reached 49%, and overall accuracy improved by about 3 percentage points over the baseline. Each compression added only about 1.7 seconds of latency, roughly 20 times faster than the original algorithm. The experiments used Claude Sonnet 4 as the orchestrator and Qwen3-14B as the worker model, and covered document types including academic papers, legal documents, novels, and government reports. The study also found that the optimal compression threshold depends on task difficulty and document length: more aggressive compression helps filter out speculative reasoning noise in difficult tasks, while lighter compression better preserves key information scattered across long documents. (Source: BlockBeats)
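To make the idea of attention-guided KV cache compression more concrete, here is a minimal sketch in plain Python. It is not Ramp Labs' implementation, whose details are not given in this report. It assumes a single attention head, scores each cached position by the average softmax attention it receives from a set of query vectors, and keeps only the highest-scoring fraction. The names `attention_scores`, `compress_kv_cache`, and `keep_ratio` are hypothetical, and a real system would operate on per-layer, per-head tensors inside the model.

```python
import math


def attention_scores(queries, keys):
    """Mean softmax attention each cached position receives across all queries."""
    dim = len(keys[0])
    totals = [0.0] * len(keys)
    for q in queries:
        # Scaled dot-product logits of this query against every cached key.
        logits = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(x - peak) for x in logits]
        norm = sum(weights)
        for i, w in enumerate(weights):
            totals[i] += w / norm
    return [t / len(queries) for t in totals]


def compress_kv_cache(keys, values, queries, keep_ratio):
    """Keep the top `keep_ratio` fraction of cached positions (hypothetical helper).

    Returns the retained keys, retained values, and the retained indices
    in their original order.
    """
    scores = attention_scores(queries, keys)
    n_keep = max(1, round(keep_ratio * len(keys)))
    ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore positional order
    return [keys[i] for i in kept], [values[i] for i in kept], kept


# Toy cache with four positions; the query aligns with positions 0 and 2.
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0], [40.0]]
queries = [[4.0, 0.0]]

small_keys, small_values, kept = compress_kv_cache(keys, values, queries, keep_ratio=0.5)
print(kept)
```

In this toy example the two positions most aligned with the query survive and the other two are dropped. Lowering `keep_ratio` corresponds to the more aggressive compression the study describes for difficult tasks, while raising it corresponds to the lighter compression suggested for long documents.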