Self-Improving Agents: When AI Starts Improving Itself

Recently, Addy Osmani published an article that gave me much to think about: “Self-Improving Coding Agents”. The idea is simple but powerful: agents that not only execute tasks, but improve their own performance over time.

This isn’t science fiction. It’s happening now, in 2026. And it has profound implications for the future of software development and, by extension, for all professions.

What is a Self-Improving Agent?

A self-improving agent is an AI system with the capacity to:

  1. Evaluate its own performance - Know when it’s doing well or poorly
  2. Learn from its mistakes - Modify behavior based on failures
  3. Optimize its processes - Improve how it approaches tasks
  4. Update autonomously - Incorporate improvements without direct human intervention
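These four capabilities can be sketched as a minimal interface. This is a hypothetical illustration, not code from Addy's article: the class, method names, and the `retry_budget` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SelfImprovingAgent:
    """Hypothetical sketch of the four capabilities listed above."""
    error_log: list = field(default_factory=list)
    retry_budget: int = 3  # a process parameter the agent can tune itself

    def evaluate(self, result: dict) -> bool:
        # 1. Evaluate its own performance: did the last run succeed?
        return result.get("tests_passed", False)

    def learn(self, error: str) -> None:
        # 2. Learn from its mistakes: record the failure for later analysis
        self.error_log.append(error)

    def optimize(self) -> None:
        # 3. Optimize its processes: e.g. widen the retry budget when
        #    failures keep accumulating
        if len(self.error_log) > 5:
            self.retry_budget += 1

    def update(self) -> None:
        # 4. Update autonomously: apply the optimization with no human input
        self.optimize()
```

The point of the sketch is the shape, not the details: each capability is a hook the agent invokes on itself, without a human in the loop.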

Addy’s Example: Cyclical Coding Agents

The key concept Addy presents is the continuous improvement cycle:

# Pseudocode of the flow
while there are failed tasks:
    for each task:
        generate solution
        execute tests
        if passes:
            mark as completed
        if fails:
            analyze error
            generate correction
            retry
    update task list

The brilliance isn’t each individual step, but the complete autonomous cycle. The agent doesn’t need a human to say “this test failed, fix it.” It detects it, corrects it, and learns from the process.
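The pseudocode above can be made concrete with a small runnable sketch. The helper functions here are stand-ins I've invented for illustration: a real agent would call an LLM where `generate_solution` is and a real test runner where `run_tests` is.

```python
def generate_solution(task, feedback=None):
    # Stand-in for an LLM call; retries incorporate the error feedback.
    base = f"solution for {task}"
    return f"{base} (fixed: {feedback})" if feedback else base


def run_tests(solution):
    # Stand-in test runner: here, only corrected solutions "pass".
    return "fixed" in solution


def analyze_error(solution):
    # Stand-in error analysis producing feedback for the retry.
    return "assertion failed"


tasks = ["task-1", "task-2"]
completed = []

while tasks:  # while there are failed tasks
    for task in list(tasks):
        solution = generate_solution(task)
        if run_tests(solution):
            tasks.remove(task)          # mark as completed
            completed.append(task)
        else:
            feedback = analyze_error(solution)     # analyze error
            solution = generate_solution(task, feedback)  # generate correction
            if run_tests(solution):                # retry
                tasks.remove(task)
                completed.append(task)
```

Notice that no human intervenes anywhere in the loop: detection, correction, and retry are all driven by the agent itself.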

The Three Critical Capabilities

According to industry analysis, an effective self-improving agent needs:

  1. Autonomy - Operate independently without constant supervision
  2. Learning - Capacity to acquire new skills and knowledge
  3. Self-improvement - Modify its own algorithms, parameters, and decision processes

Without all three, it’s not really a self-improving agent. It’s just a script with some ML models underneath.

The 80% Problem

Addy also mentions the “80% Problem in Agentic Coding”:

“Agents can rapidly generate 80% of the code, but the remaining 20% requires deep knowledge of context, architecture, and trade-offs.”

This is the current challenge. Agents are brilliant at “obvious” tasks, but struggle with:

  • Subtle architectural decisions
  • Long-term design trade-offs
  • Tacit domain knowledge
  • Experience-based judgments

My reading: this won’t change soon. Agents will continue to be powerful tools for the 80%, but the 20% will still need experienced humans.

What This Means for Us

The Developer as Orchestrator

If agents can do 80% of the work… what do we do?

I think the developer role is evolving toward:

  • Orchestrator - Design multi-agent systems that work together
  • Architect - Define system structure and constraints
  • Reviewer - Review and curate agent outputs
  • Problem-solver - Tackle the difficult 20% agents can’t solve

My Personal Take

I have to admit it: the transition Addy describes, from 80/20 to 20/80, resonates with me a lot. I'm also doing far more "orchestration" and far less "pure coding" than I was a year ago.

Is it good? Is it bad? Neither. It’s different.

What worries me is the skills gap that’s opening:

  • Developers who adopt agents → Super-productive
  • Developers who don’t adopt agents → Left behind
  • Developers who only know how to use agents → Superficial

The sweet spot is in the middle: knowing how to use agents but understanding what they’re doing.

Conclusion

Self-improving agents aren’t the future. They’re the present.

And it’s not hype. It’s a natural evolution of AI we’re seeing: from passive tools to autonomous systems that learn and improve.

The question isn’t “will agents replace us?” The question is: “how do we work with agents to be 10x more productive?”

My answer: understanding they’re tools, not magic. And that human value is in thinking critically about what problem to solve, how to solve it, and whether the solution the agent produced is really good.

The rest? Well, agents already do that pretty well.
