Image source: Anthropic Co-Founder Tweet
In AI discussions, conclusions often take center stage, while the reasoning behind them is easily overlooked. This dynamic is especially evident in debates around Recursive Self-Improvement (RSI). On the surface, the main point of contention is a bold claim: by 2028, there’s a significant probability that AI will achieve self-reinforcing R&D capabilities. The deeper issue, however, is whether we have already observed enough "systematic early signals" to move this scenario from a fringe hypothesis into the core risk set that mainstream decision-makers must address.
This question carries weight for both policy and industry because RSI is not simply an abstract "general intelligence myth." Instead, it’s an engineering challenge: can AI take on an increasing number of high-value steps within R&D workflows and connect these steps into a continuously iterative closed loop? Once such a loop is established, the pace of technological progress changes, organizational capability gaps are redefined, and traditional regulatory cycles are disrupted.
Therefore, the RSI debate should move beyond "belief or disbelief" and focus on whether the evidence is sufficient, extrapolations are prudent, and preparations are adequate.
The strongest evidence supporting RSI isn’t a single model’s breakthrough; it’s the synchronized progress across tasks, scenarios, and evaluation frameworks. Commonly cited benchmarks—research reproducibility, post-training optimization, real-world competition problem-solving, and software engineering challenges—all show upward trends to varying extents. The real value lies in their "directional consistency," not just their "absolute values": when multiple proxy metrics improve together over time, it typically signals a broad-based enhancement of underlying capabilities.
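To make "directional consistency" concrete, here is a minimal sketch in Python; the benchmark names and quarterly scores are hypothetical illustrations. It checks whether every proxy metric has a positive least-squares slope over time, which is the pattern the argument above relies on:

```python
# Minimal sketch: is progress across several proxy benchmarks directionally
# consistent? All benchmark names and scores here are hypothetical.
from statistics import mean

def slope(scores: list[float]) -> float:
    """Least-squares slope of scores against their time index 0, 1, 2, ..."""
    xs = range(len(scores))
    x_bar, y_bar = mean(xs), mean(scores)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Hypothetical quarterly scores for four proxy benchmarks.
benchmarks = {
    "research_reproduction": [0.21, 0.28, 0.35, 0.41],
    "post_training_opt":     [0.30, 0.33, 0.39, 0.47],
    "competition_solving":   [0.15, 0.22, 0.24, 0.33],
    "software_engineering":  [0.25, 0.31, 0.38, 0.44],
}

slopes = {name: slope(s) for name, s in benchmarks.items()}
# "Directional consistency": every proxy improves over time, regardless of
# how the absolute levels compare across benchmarks.
print(slopes, "consistent:", all(v > 0 for v in slopes.values()))
```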
However, there are three key limitations to recognize:
Benchmark environments differ from real-world settings. Benchmarks have clear boundaries, stable feedback, and repeatable evaluation standards. In actual R&D, you face goal drift, cross-team collaboration, tacit knowledge transfer, resource constraints, and institutional friction. Success in controlled environments doesn’t automatically translate to reliable organizational output.
Metric visibility doesn’t equal complete capability. Current benchmarks measure "problem-solving ability" more easily but struggle to fully capture higher-order R&D behaviors—like problem definition, priority trade-offs, failure attribution, and cross-cycle governance. In short, models may get better at "solving the right problems," but not necessarily at "consistently doing the right things."
Extrapolating trends can be disrupted by bottleneck migration. History shows that technological progress isn't always linear. As one bottleneck is overcome, new ones may emerge in data quality, compute costs, system reliability, compliance, or social acceptance. Ignoring these second-order constraints can lead to overestimating progress and underestimating resistance.
Thus, consistent multi-benchmark progress is a strong signal, but not definitive proof. It tells us that the direction is real, not that the timeline is fixed.
The real debate about RSI isn’t whether "AI is getting stronger," but whether "the gains are enough to form a closed loop." A true closed loop involves at least five sequential steps: information intake and literature review, hypothesis generation, experiment design and execution, result evaluation and error diagnosis, and strategy update and re-iteration. Improving one step boosts efficiency, but only robust integration across steps produces compounding returns.
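To show what that integration means structurally, here is a minimal sketch of the five-stage loop; every step function is a hypothetical stub, not a real system. The compounding effect comes from feeding the updated strategy back into the next pass through all five stages:

```python
# Minimal sketch of the five-stage loop; every step is a hypothetical stub.
def intake(state):            # 1. information intake and literature review
    return state | {"context": "summarized literature"}

def hypothesize(state):       # 2. hypothesis generation
    return state | {"hypothesis": f"h{state['iteration']}"}

def run_experiment(state):    # 3. experiment design and execution
    return state | {"result": 0.5 + 0.1 * state["iteration"]}  # fake signal

def evaluate(state):          # 4. result evaluation and error diagnosis
    return state | {"diagnosis": "improving" if state["result"] > 0.6 else "weak"}

def update_strategy(state):   # 5. strategy update and re-iteration
    return state | {"iteration": state["iteration"] + 1}

state = {"iteration": 0}
for _ in range(3):  # compounding requires chaining all five steps, every cycle
    for step in (intake, hypothesize, run_experiment, evaluate, update_strategy):
        state = step(state)
print(state)
```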
Currently, we see progress mainly in the first three and part of the fourth: models are increasingly efficient at code generation, experiment scripting, literature summarization, and parameter search. The hardest parts of the closed loop usually come down to two capabilities:
Robust diagnosis: Can the system accurately pinpoint root causes amid noisy data, conflicting signals, or sporadic failures, instead of just applying superficial fixes?
Goal alignment: Can the system consistently execute "long-term effective but short-term suboptimal" strategies under multiple constraints, rather than just maximizing local scores?
That’s why "can do" doesn’t mean "can be accountable." A closed R&D loop is not simply the sum of model capabilities—it’s the product of technology, process design, and responsibility structures. Without clear accountability and audit mechanisms, organizations will struggle to delegate authority safely, even if the technology is nearly ready.
Saying "60% by 2028" is useful for communication—it forces the public to recognize that the time window might be shorter than expected. But in decision-making, such figures should be seen as subjective probabilities, not precise statistical estimates. A more practical approach is to convert point probabilities into a "scenario-threshold" framework.
Three scenario levels are useful:
Baseline: AI is deeply integrated into R&D, but humans still make key decisions—a "high automation, human fallback" model.
Acceleration: AI achieves quasi-closed-loop iteration in several domains, sharply shortening R&D cycles and giving leaders a compounding advantage.
High-impact: Cross-domain closed-loop capabilities emerge, model iteration outpaces regulatory adaptation, and governance pressures intensify.
For each scenario, set clear threshold metrics instead of arguing over specific years. Examples include: unattended continuous iteration duration, cross-task transfer success rate, anomaly-detection recall, auto-rollback success rate, and the proportion of key decision points that require manual intervention. When a threshold is crossed, governance actions are triggered; when the metric falls back below it, constraints are relaxed. This approach transforms abstract predictions into actionable management.
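A minimal sketch of such a threshold-triggered mechanism follows; the metric names, cut-off values, and actions are hypothetical illustrations, not established standards:

```python
# Sketch of a "scenario-threshold" monitor. Metric names, cut-offs, and
# actions are hypothetical illustrations, not established standards.
THRESHOLDS = {
    # metric: (trigger_level, governance_action)
    "unattended_iteration_hours": (72.0, "require human checkpoint review"),
    "cross_task_transfer_rate":   (0.80, "escalate to acceleration scenario"),
    "anomaly_detection_recall":   (0.99, "permit wider autonomous scope"),
}

def evaluate_thresholds(observed: dict[str, float]) -> list[str]:
    """Return the governance actions whose trigger levels have been reached.

    When a metric later falls back below its trigger level, the matching
    constraint can be relaxed again (not shown here).
    """
    actions = []
    for metric, (level, action) in THRESHOLDS.items():
        value = observed.get(metric)
        if value is not None and value >= level:
            actions.append(f"{metric}={value}: {action}")
    return actions

# Hypothetical monitoring snapshot: only the first threshold is crossed.
print(evaluate_thresholds({
    "unattended_iteration_hours": 96.0,
    "cross_task_transfer_rate": 0.62,
}))
```

Mapping each metric to a predeclared action means governance responses are decided in advance, not improvised once a threshold is crossed.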
If RSI or quasi-RSI emerges, industry competition will shift from "model performance" to "closed-loop operations." Winning will depend less on who has the biggest model, and more on who can build shorter, more stable, and more controllable R&D cycles within real organizations.
Organizational boundaries will be redrawn. Traditional R&D processes, once a sequence of specialized roles, will become collaborative networks of "a few key people plus large fleets of AI agents." Roles won't simply disappear; they'll migrate toward system orchestration, quality control, and risk governance.
Efficiency gains will be nonlinear. Organizations that automate processes first may achieve generational advantages in iteration speed, cost structure, and scale of experimentation. Those that only introduce AI in isolated areas will see more linear, incremental improvements—struggling to close structural gaps.
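A toy calculation, with hypothetical percentages, makes the nonlinearity concrete: an organization that compounds a per-cycle speedup pulls away geometrically from one that only gains a fixed increment per cycle.

```python
# Toy arithmetic, hypothetical numbers: compounding vs. linear efficiency gains.
compound, linear = 1.0, 1.0
for _ in range(10):
    compound *= 1.20   # 20% faster each cycle, applied to the new pace
    linear += 0.20     # a fixed +20% of the original pace each cycle
print(f"after 10 cycles: compounding {compound:.1f}x vs linear {linear:.1f}x")
# ~6.2x vs 3.0x, and the gap widens with every additional cycle
```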
"Trustworthy R&D capability" will become the new competitive moat. Future high-value competitiveness won’t just be about being "fast," but about being "fast and demonstrably safe." Traceable logs, reproducible experiments, strategy change audits, and incident response systems will shift from compliance costs to assets of market trust.
As acceleration becomes possible, governance shouldn’t aim to halt progress but to establish "verifiable controllability." This requires advancing technical and institutional governance in parallel.
Technically, safety must be integrated into the R&D pipeline: default logging of key decisions, dual approval for high-risk actions, sandbox boundaries for model self-modification, and mandatory review of anomalous performance jumps. The core principle: "observability before delegation."
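As an illustration of "observability before delegation," here is a minimal sketch combining default logging with dual approval for high-risk actions; the risk tiers and approval policy are hypothetical, not an established standard:

```python
# Sketch of "observability before delegation": every governed action is
# logged, and high-risk actions need two distinct human approvers before
# they execute. Risk tiers and the approval policy are hypothetical.
import logging
from functools import wraps

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("rd_pipeline")

def governed(risk: str):
    """Decorator: log every call; require dual approval when risk is 'high'."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, approvals=(), **kwargs):
            log.info("action=%s risk=%s approvals=%s", fn.__name__, risk, approvals)
            if risk == "high" and len(set(approvals)) < 2:
                raise PermissionError(f"{fn.__name__} needs two distinct approvers")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@governed(risk="high")
def modify_training_pipeline(change: str) -> str:
    return f"applied: {change}"

# Raises PermissionError with only one sign-off:
#   modify_training_pipeline("raise learning rate", approvals=("alice",))
print(modify_training_pipeline("raise learning rate", approvals=("alice", "bob")))
```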
Institutionally, adopt tiered governance—not one-size-fits-all. Allow flexibility for low-risk applications, but require higher transparency and accountability for high-impact systems, with mechanisms for dynamic updates. Static rules can’t keep up with rapid iteration; regulation itself must be able to "continuously recalibrate."
Organizationally, "human responsibility anchors" must be explicit. When AI participates in R&D and deployment decisions, key points must have identifiable, accountable human signatories. Automation without responsibility anchors only increases speed, not quality.
Returning to the central question: is this perspective valid? The direction is sound, but the claim must be stated with care. It's valid because it highlights that AI is advancing across multiple R&D dimensions and that the closed-loop tipping point may arrive sooner than expected. Caution is essential because any specific date or probability rests on subjective assumptions and tends to underestimate real-world friction.
For decision-makers, the best approach isn’t to swing between optimism and pessimism, but to build resilience amid uncertainty:
On one hand, prepare as if acceleration "could happen sooner," avoiding passive responses at critical moments. On the other, constrain system expansion with layered scenarios, quantifiable thresholds, and responsibility anchors, ensuring capability growth remains within controllable bounds.
If the last phase of AI was about "enabling machines to complete tasks," the next, more crucial question is: as machines begin to help create the next generation of machines, can humanity evolve its governance and responsibility systems in step?
This isn’t just a technical forecasting challenge—it’s about redefining the future order of innovation.