If you evaluated Microsoft AutoGen sometime in the past two years and then lost track of the project, you may have missed a significant rupture. The original AutoGen repository forked — and two distinct projects now compete for the same name recognition. For any team deciding where to build a multi-agent system, that split is not a minor footnote. It changes which codebase you’d actually be committing to.
This article cuts through the confusion. We explain what happened, what each branch does well, and give a plain verdict on whether AutoGen or its fork AG2 belongs in a production build today — particularly inside an Azure-heavy or Microsoft-centric IT estate.
What Actually Happened: AutoGen, AG2, and the Microsoft Agent Framework
AutoGen originated as a Microsoft Research project. Its core idea — letting multiple LLM-backed agents converse with each other to solve problems collaboratively — was genuinely novel when it launched, and the research community embraced it fast.
The fork happened because Microsoft’s ambitions for the project outgrew its research-lab origins. In late 2024, a significant portion of the original AutoGen contributors departed and relaunched the project under the name AG2 (published as the package ag2 and hosted under the GitHub organization ag2ai). Their stated goal: maintaining an open, community-driven governance structure independent of any single corporate roadmap.
Meanwhile, Microsoft rebuilt what remained of the original AutoGen lineage. The immediate result was AutoGen 0.4, a ground-up rewrite with async, event-driven internals, released in January 2025. That was not the end of the story: in October 2025 Microsoft moved AutoGen 0.4 into maintenance mode (bug fixes and security patches only, no new features) and launched the Microsoft Agent Framework as its official successor, merging AutoGen and Semantic Kernel into a single SDK. The Agent Framework reached v1.0 in April 2026 and is now the active Microsoft-backed path. As of this writing, AutoGen 0.4 is therefore feature-frozen: still patched, but no longer a sensible target for new builds. On the managed side, these concepts feed into Azure AI Foundry Agent Service and the broader Azure AI Foundry stack.
So when someone says “I’m using AutoGen,” they might mean:
- AG2 (the community fork): The more actively maintained Python framework, closest in spirit to the original multi-agent conversation model. Lives at github.com/ag2ai/ag2.
- Microsoft AutoGen 0.4: Rewritten internals, tighter Azure integration, diverging API surface. Now in maintenance mode, with no new features. Lives at github.com/microsoft/autogen.
- Microsoft Agent Framework (v1.0, April 2026): Microsoft’s active successor, merging AutoGen and Semantic Kernel. Python and .NET (C#) supported as co-equal languages. The build target to use if you are committed to the Microsoft ecosystem.
- Azure AI Foundry Agent Service: Microsoft’s hosted/managed direction, powered by the Agent Framework, which is a different product category entirely.
For most decision-makers evaluating this for a production build, the practical question is: AG2 or the Microsoft-maintained AutoGen lineage? They are no longer the same thing.
Where Multi-Agent Conversation Shines — and Where AutoGen/AG2 Pioneered It
The original insight behind AutoGen was important: many non-trivial AI tasks are better handled by a structured conversation between specialized agents than by one overloaded generalist. A planner agent breaks down a goal, a coder agent writes the script, a reviewer agent checks it, an executor agent runs it and reports back. Each agent has a defined role, and the conversation protocol coordinates them.
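As a framework-free illustration of that hand-off, the sketch below passes a goal through four stub agents in a fixed order. Everything in it is hypothetical: `Agent`, `Message`, `run_pipeline`, and the lambda replies stand in for LLM calls and are not the AutoGen or AG2 API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    sender: str
    content: str


@dataclass
class Agent:
    # Hypothetical stand-in for an LLM-backed agent; `respond` would normally call a model.
    name: str
    respond: Callable[[str], str]


def run_pipeline(goal: str, agents: List[Agent]) -> List[Message]:
    """Pass the goal through each agent in order, recording every message."""
    transcript = [Message("user", goal)]
    for agent in agents:
        # Each agent only sees the previous message in this simplified version.
        reply = agent.respond(transcript[-1].content)
        transcript.append(Message(agent.name, reply))
    return transcript


agents = [
    Agent("planner", lambda text: f"plan: split '{text}' into steps"),
    Agent("coder", lambda text: f"code for [{text}]"),
    Agent("reviewer", lambda text: f"approved: {text}"),
    Agent("executor", lambda text: f"ran: {text}"),
]

transcript = run_pipeline("sum a CSV column", agents)
for message in transcript:
    print(f"{message.sender}: {message.content}")
```

Real frameworks replace the fixed order with a conversation protocol that decides who speaks next, but the shape is the same: named roles, explicit messages, and a transcript you can inspect afterwards.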
This model maps well onto:
- Research and analysis workflows — synthesizing multiple sources, cross-checking findings, iterating on drafts
- Software development assistance — multi-step coding tasks with review and test loops
- Internal Q&A with tool access — agents that can query databases, call APIs, then reason over the results before responding
The conversational, message-passing architecture means you get observable, debuggable reasoning chains. You can see what each agent said to the others and why a decision was made. For regulated industries or risk-conscious operations, that auditability matters. See our piece on multi-agent systems for a broader look at when this architecture makes sense.
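One way to make that auditability concrete is to persist every inter-agent message as a structured record. The sketch below is a minimal illustration using only the standard library; the `record` and `messages_from` helpers and the field names are assumptions for this example, not something either framework ships.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List


def record(log: List[str], sender: str, recipient: str, content: str) -> None:
    """Append one inter-agent message to the log as a JSON line."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "sender": sender,
        "recipient": recipient,
        "content": content,
    }
    log.append(json.dumps(entry))


def messages_from(log: List[str], sender: str) -> List[Dict[str, str]]:
    """Return every logged message sent by one agent, in order."""
    entries = [json.loads(line) for line in log]
    return [entry for entry in entries if entry["sender"] == sender]


audit_log: List[str] = []
record(audit_log, "planner", "coder", "write a script that sums column B")
record(audit_log, "coder", "reviewer", "def total(rows): return sum(r['B'] for r in rows)")
record(audit_log, "reviewer", "coder", "handle missing values before summing")

# Answer an audit question: what did the reviewer tell the other agents?
reviewer_messages = messages_from(audit_log, "reviewer")
```

In a real deployment the list would be an append-only file or log store, but the principle holds: if each message carries sender, recipient, and timestamp, you can reconstruct why a decision was made.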
AG2 vs Microsoft AutoGen: Which Branch to Build On Today
Here is a direct comparison across the dimensions that matter for a business build:
| Dimension | AG2 (community fork) | Microsoft Agent Framework |
|---|---|---|
| API stability | More stable; closer to original API | v1.0 since April 2026; earlier 0.4 had breaking changes |
| Community activity | High; original core maintainers | Microsoft-backed; AutoGen 0.4 community migrating to Agent Framework |
| Azure integration | Manual; bring your own connectors | First-class Azure OpenAI, AI Foundry support |
| Production tooling | Improving but still limited | Optimised for Microsoft stack |
| Language support | Python only | Python and .NET (C#) as co-equal first-class languages |
| Non-Azure LLM support | Good | Possible but clearly non-primary |
| Hosted/managed option | None | Azure AI Foundry Agent Service |
The verdict: If you are building on Azure and your IT estate is Microsoft-centric, the Microsoft Agent Framework — or Azure AI Foundry Agent Service — is the pragmatic path. Tighter integration, a real support model, and an actively developed SDK. Note that AutoGen 0.4 itself has entered maintenance mode; if you are starting a new build, target the Agent Framework directly.
If you want a neutral Python framework for multi-agent workflows and are not Azure-bound, AG2 is the more active project and better preserved the original design intent.
Is AutoGen / AG2 Actually Production Ready?
This is where honest assessment requires separating “can you build something that works” from “is it hardened for business operations.”
Where both branches struggle at the production edge:
- Reliability under long conversation chains. Multi-agent conversations can drift, loop, or produce inconsistent outputs as context windows fill. Guardrails require explicit engineering work — they do not come out of the box.
- Observability. Tracing what happened across five agents during a failed run is harder than it should be. AutoGen 0.4 introduced OpenTelemetry-based tracing as a first-class feature, which is a genuine improvement over v0.2. In practice, instrumentation guidance covers only the lower-level Core API; AgentChat-level abstractions lack dedicated tracing documentation, and multi-component configurations can trigger provider conflicts. For a production build, expect to invest in observability setup beyond what is documented out of the box.
- Error recovery. When an agent step fails mid-workflow — an API timeout, a bad tool call response — neither branch has robust built-in retry and compensation logic. You build that yourself.
- Deployment patterns. AutoGen and AG2 are frameworks, not platforms. There is no built-in scheduler, no managed queue, no deployment target. You bring your own infrastructure.
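To give a sense of the error-recovery work involved, here is a minimal sketch of a retry wrapper with exponential backoff and a compensation hook. It is plain Python, not an AutoGen or AG2 feature; `run_step`, `compensate`, and the simulated failures are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def run_step(
    step: Callable[[], T],
    compensate: Optional[Callable[[], None]] = None,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run one workflow step, retrying with exponential backoff.

    If the final attempt fails, run `compensate` (if given) and re-raise.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                if compensate is not None:
                    compensate()  # undo side effects of earlier steps
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # delay doubles each retry
            attempt += 1


# Simulated tool call that times out twice, then succeeds.
calls = {"count": 0}


def flaky_tool_call() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated API timeout")
    return "ok"


delays: List[float] = []
result = run_step(flaky_tool_call, sleep=delays.append)  # record delays instead of sleeping

# A step that never succeeds triggers the compensation hook.
rolled_back: List[str] = []


def always_fails() -> str:
    raise RuntimeError("simulated bad tool response")


try:
    run_step(
        always_fails,
        compensate=lambda: rolled_back.append("ticket draft deleted"),
        max_attempts=2,
        sleep=lambda seconds: None,
    )
except RuntimeError as error:
    final_error = str(error)
```

A production version would likely retry only on errors known to be transient rather than on every `Exception`, and would add jitter to the delay. The injectable `sleep` parameter is a design choice that keeps the wrapper testable without real waiting.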
Compare this against the production-readiness test we apply to all agent frameworks: observability, error recovery, state management, deployment, and security controls. On that rubric, both AutoGen branches score well on the reasoning and conversation layer, and poorly on the operational layer.
That is not a disqualifier. It means you build the operational scaffolding on top — or you use a more opinionated framework that includes it.
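A small piece of that scaffolding is a guardrail against conversations that loop or run on. The sketch below caps the number of turns and stops when an agent repeats itself; `run_conversation`, the `"TERMINATE"` sentinel, and the stub agents are assumptions for this example rather than a specific framework's API.

```python
from typing import Callable, List, Tuple


def run_conversation(
    next_message: Callable[[List[str]], str],
    max_turns: int = 10,
    repeat_limit: int = 2,
) -> Tuple[List[str], str]:
    """Collect messages until a stop condition is hit.

    Returns the history and the reason the loop ended:
    "done", "loop_detected", or "max_turns".
    Assumes repeat_limit >= 2.
    """
    history: List[str] = []
    for _ in range(max_turns):
        message = next_message(history)
        if message == "TERMINATE":
            return history, "done"
        history.append(message)
        # Stop when the same message appears `repeat_limit` times in a row.
        tail = history[-repeat_limit:]
        if len(tail) == repeat_limit and len(set(tail)) == 1:
            return history, "loop_detected"
    return history, "max_turns"


def stuck_agent(history: List[str]) -> str:
    # Simulated agent that starts repeating itself after one useful reply.
    return "draft v1" if not history else "please clarify"


history, reason = run_conversation(stuck_agent)

# A scripted conversation that ends cleanly.
scripted = iter(["plan", "code", "TERMINATE"])
done_history, done_reason = run_conversation(lambda h: next(scripted))

# A conversation that never terminates is cut off at the turn cap.
capped_history, capped_reason = run_conversation(lambda h: f"turn {len(h)}", max_turns=3)
```

Exact-match repetition is a deliberately crude signal; real drift detection usually needs similarity checks or a budget on tokens and cost. The point is that the stop conditions live in your code, where you can log and alert on them.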
Who Should Actually Consider AutoGen or AG2
Good fit:
- Teams already invested in the Microsoft/Azure ecosystem, where Azure AI Foundry Agent Service — built on the Microsoft Agent Framework — adds managed infrastructure around multi-agent workflows
- Research-adjacent applications — internal knowledge synthesis, long-form analysis tasks, complex document review — where the conversational agent model maps naturally onto the workflow
- Python-native development teams who want fine-grained control over agent conversation design and are comfortable building their own deployment layer
- Organizations piloting multi-agent reasoning before committing to a heavier orchestration framework like LangGraph
Poor fit:
- Teams who need a production-ready agent platform with built-in scheduling, observability, and deployment without significant custom infrastructure work
- Projects requiring TypeScript or JavaScript integration — AG2 is Python-only, and the Microsoft Agent Framework supports Python and .NET (C#) but not TypeScript or JavaScript directly
- Simple task automation that does not need multi-agent reasoning — the overhead is not worth it
- Organizations with no Azure footprint who want managed hosting — there is no neutral managed option for AG2
For a broader comparison across Python frameworks, our open-source agent framework shortlist covers how AutoGen/AG2 sits alongside LangGraph, CrewAI, and others. If you are specifically weighing Python-based multi-agent options, CrewAI vs LangGraph is a useful companion read.
The Azure-Specific Case: When Microsoft’s Stack Closes the Gap
One scenario where the Microsoft Agent Framework becomes genuinely compelling: your organization already uses Azure OpenAI Service, Azure AI Foundry, and has Microsoft 365 Copilot deployments in flight. In that context, Azure AI Foundry Agent Service — built on the Microsoft Agent Framework — gives you:
- Managed agent execution with Azure’s compliance and security posture
- Native connectivity to Azure OpenAI models without extra configuration
- Integration with Microsoft identity and access management
- A path toward Copilot Studio for less-technical users to interact with the same agent infrastructure
For a Swiss or European SMB with strict data residency requirements and an existing Microsoft Enterprise Agreement, this is not a trivial advantage. Azure’s European data centers and compliance certifications matter, and the integration effort you would otherwise spend wiring a neutral framework into your estate is partially offset by first-class Azure support.
The caution: you are now on a Microsoft product roadmap, and the rate of change in this stack has been high. The lineage has gone through three framework generations (the original AutoGen, the 0.4 rewrite, and the Agent Framework) in under three years. What sits beneath Azure AI Foundry Agent Service today may look substantially different in twelve months. Build to interfaces, not to the framework internals.
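In practice, building to interfaces can be as simple as one narrow contract that application code depends on, with each framework wrapped in an adapter behind it. The sketch below uses `typing.Protocol`; `AgentRunner`, `EchoRunner`, `UppercaseRunner`, and `summarize_tickets` are hypothetical names, and the two runners are placeholders where real adapters would wrap a framework client.

```python
from typing import List, Protocol


class AgentRunner(Protocol):
    """The only surface application code is allowed to depend on."""

    def run(self, task: str) -> str: ...


class EchoRunner:
    # Hypothetical placeholder backend; a real adapter would wrap a framework client here.
    def run(self, task: str) -> str:
        return f"echo: {task}"


class UppercaseRunner:
    # Second placeholder backend, to show that swapping it needs no caller changes.
    def run(self, task: str) -> str:
        return task.upper()


def summarize_tickets(runner: AgentRunner, tickets: List[str]) -> List[str]:
    """Application logic written against the interface, not a specific framework."""
    return [runner.run(f"summarize: {ticket}") for ticket in tickets]


first = summarize_tickets(EchoRunner(), ["VPN down"])
second = summarize_tickets(UppercaseRunner(), ["VPN down"])
```

If the framework underneath changes again, only the adapter class is rewritten; `summarize_tickets` and everything that calls it stays untouched.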
The Decision Question to Ask Before Choosing Any Framework
Framework selection rarely deserves the strategic weight engineers give it. The more important questions are:
- What does the agent actually need to do, and does the multi-agent conversation model fit that task?
- What infrastructure do we already have, and what operational gaps does the framework create?
- How much of our engineering capacity goes to framework plumbing versus the actual problem we are solving?
AutoGen and AG2 answer question one well for collaborative reasoning tasks. They leave questions two and three largely to you.
That engineering overhead is real. At Orange ITS, framework choice follows use-case and operational requirements — not the other way around. A 50-person professional services firm does not have the same calculus as a fintech with a dedicated MLOps team.
Talk Through Your Agent Build Before You Commit to a Framework
Choosing between AG2, the Microsoft AutoGen lineage, LangGraph, or custom development hinges on your stack, your team’s capacity, and how much operational infrastructure you want to own.
If you are in the evaluation phase, a 30-minute call with the Orange ITS team can save weeks of framework experimentation. We will look at your use case, your existing infrastructure, and give you a direct view of which approach is likely to ship and which is likely to stall.