Google Research Releases ReasoningBank: AI Agents Learn Reasoning Strategies from Success and Failure

Gate News message, April 22: Google Research has released ReasoningBank, an agent memory framework that lets agents driven by large language models keep learning after deployment. The framework distills generalizable reasoning strategies from both successful and failed task attempts and stores them in a memory bank, where they can be retrieved and applied to similar future tasks. The accompanying paper was published at ICLR, and the code has been open-sourced on GitHub.

ReasoningBank improves on two existing approaches: Synapse, which records complete action trajectories but transfers poorly to new tasks because those trajectories are too fine-grained, and Agent Workflow Memory, which learns only from successful cases. ReasoningBank makes two key changes. First, it stores “reasoning patterns” instead of “action sequences,” with each memory item containing structured title, description, and content fields. Second, it incorporates failure trajectories into learning: the framework uses a model to self-evaluate execution trajectories and turns failures into rules for avoiding known pitfalls. For example, the rule “click Load More button when seen” evolves into “verify current page identifier first, avoid infinite scrolling loops, then click load more.”
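To make the memory structure concrete, here is a minimal Python sketch of a memory item with the three fields described above, plus a toy memory bank. It is an illustration under stated assumptions, not the released code: the names `MemoryItem`, `MemoryBank`, and `from_failure`, the second example item, and the word-overlap retrieval are all hypothetical stand-ins, since the article does not say how retrieval is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    """One stored reasoning pattern with the three fields the article names."""
    title: str
    description: str
    content: str
    from_failure: bool = False  # hypothetical flag, not taken from the article


def _tokens(text):
    """Lowercased whitespace tokens; a crude stand-in for real similarity search."""
    return set(text.lower().split())


class MemoryBank:
    """Toy store that ranks items by word overlap with the task description."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def retrieve(self, task, k=1):
        query = _tokens(task)
        scored = []
        for item in self.items:
            score = len(query & _tokens(item.title + " " + item.description))
            if score > 0:
                scored.append((score, item))
        # Highest overlap first; sort is stable, so ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [item for _, item in scored[:k]]


bank = MemoryBank()
bank.add(MemoryItem(
    title="Guard load more clicks",
    description="Paginated list pages with a load more button",
    content="Verify current page identifier first, avoid infinite "
            "scrolling loops, then click load more.",
    from_failure=True,  # this rule came out of a failed trajectory
))
bank.add(MemoryItem(  # hypothetical second item, for contrast only
    title="Check filters before search",
    description="Search forms with preset filters",
    content="Inspect preset filters before submitting a search.",
))

hits = bank.retrieve("collect all reviews from a list with a load more button")
print(hits[0].title)
```

The retrieved item's `content` would then be added to the agent's prompt for the new task. Storing a short strategy rather than a full click-by-click trajectory is what lets the same item apply to a different site with a similar pagination pattern.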

The paper also introduces Memory-aware Test-time Scaling (MaTTS), which allocates additional compute during inference to explore multiple trajectories and store findings in the memory bank. Parallel expansion runs multiple distinct trajectories for the same task, refining more robust strategies through self-comparison; sequential expansion iteratively refines a single trajectory, storing intermediate reasoning in memory.
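The two expansion modes can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the paper's implementation: `toy_agent` and `toy_refine` are hypothetical stand-ins for LLM calls, and the set-based comparison only mimics the self-comparison step, which in the framework is performed by a model.

```python
def parallel_expansion(task, run_agent, k):
    """Run k trajectories for one task and compare them with each other."""
    trajectories = [run_agent(task, seed) for seed in range(k)]
    step_sets = [set(t) for t in trajectories]
    # Steps present in every trajectory are treated as the robust core.
    common = step_sets[0].intersection(*step_sets[1:])
    shared = [step for step in trajectories[0] if step in common]
    # Steps that only some trajectories took are candidates for pitfall rules.
    divergent = sorted(set().union(*step_sets) - common)
    return {"shared": shared, "divergent": divergent}


def sequential_expansion(task, run_agent, refine, rounds, seed=0):
    """Refine a single trajectory repeatedly, keeping every intermediate version."""
    trajectory = run_agent(task, seed)
    history = [trajectory]
    for _ in range(rounds):
        trajectory = refine(task, trajectory)
        history.append(trajectory)
    return history


def toy_agent(task, seed):
    """Hypothetical agent: odd seeds wander into an extra scrolling step."""
    steps = ["open page", "verify page identifier", "click load more"]
    if seed % 2 == 1:
        steps.insert(1, "scroll down")
    return steps


def toy_refine(task, trajectory):
    """Hypothetical refinement: drop the step judged unnecessary."""
    return [step for step in trajectory if step != "scroll down"]


parallel = parallel_expansion("collect reviews", toy_agent, k=5)
history = sequential_expansion("collect reviews", toy_agent, toy_refine,
                               rounds=2, seed=1)
print(parallel["shared"], parallel["divergent"])
print([len(t) for t in history])
```

In both modes the output (the shared and divergent steps, or the refinement history) is the material that would be distilled into new memory items, which is how the extra inference compute feeds back into the memory bank.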

On WebArena browser tasks and SWE-Bench-Verified coding tasks, with Gemini 2.5 Flash as a ReAct agent, ReasoningBank achieved an 8.3% higher success rate on WebArena and a 4.6% higher success rate on SWE-Bench-Verified than a baseline without memory, and reduced the average number of steps per task by approximately 3. Adding MaTTS with parallel expansion (k=5) improved the WebArena success rate by a further 3 percentage points and cut a further 0.4 steps per task.

