Jun 18, 20265 min read/2026/06/18/would-an-agent-framework-fix-llm-migration/

Would an Agent Framework Fix It? What Microsoft and Copilot Change About LLM Code Migration — and What They Don't

In my last post I argued that when you point an
LLM at a large legacy migration, the model is the easy part — the hard part is everything around it:
feeding it context, applying its edits, knowing when to stop, respecting framework boundaries, and
measuring progress. A fair question came back almost immediately: what if you don't hand-roll that
harness?
What if you reach for a real agent framework — Microsoft Agent Framework, Semantic Kernel,
Microsoft.Extensions.AI, or the GitHub Copilot SDK? Do the problems disappear?

Short answer: a good framework dissolves some of them outright and relocates the rest. It's worth
being precise about which is which, because it tells you exactly how much work you're actually saving.

What an agent framework genuinely gives you

These frameworks are not thin wrappers. They give you real leverage:

  • Tool / function calling — the model can read files, search the repo, and run the build itself,
    as tool calls, instead of you spoon-feeding everything into a prompt.
  • Orchestration — planner / worker / reviewer become agents with handoffs, group chat, or
    manager-style coordination, instead of your bespoke control flow.
  • Threads and memory — multi-turn context is managed for you.
  • A provider abstractionMicrosoft.Extensions.AI's IChatClient (and the Kernel on top of it)
    lets you swap models without rewriting your pipeline. The GitHub Copilot SDK goes further and is
    already repo-aware and wired to edit code.

The problems that genuinely go away

Two of the worst problems from the last post are artifacts of a hand-rolled, single-shot design, and
a tool-using agent kills them by construction:

  1. The diff trap is gone. When the agent edits files through tools, there is no unified diff to
    hallucinate or fail to apply. The entire category of "the code was fine but the patch wouldn't
    land" simply doesn't exist.
  2. Context starvation is gone. When the model can grep and open files on its own, it stops
    blocking for "I can't see the method I need to call" — it just goes and reads it. You delete the
    machinery you built to resolve and inject dependencies.

If those two were eating most of your engineering time — and in my experience they were — that's a
real, large win. This is the same conclusion I reached last time: for throughput, an agentic engine
beats a structured single-shot pipeline. An agent framework is a clean way to get one.

The problems that do NOT go away

Here's the part the framework marketing won't tell you. These survive intact, because they were never
about the plumbing — they're about the domain and the loop:

  1. The verification gate is still yours. No framework knows that "success" for your migration means
    compiles, and contains zero references to the old framework. The agent can run the build (great),
    but you define what counts as done, parse the errors, and decide pass/fail. That logic is migration-
    specific and you write it.
  2. "When to stop" is still yours. Last post's sharpest lesson — a flat error count across retries
    means the fix is at a higher tier (a package or project change the model can't make from inside a
    source file). An agent loop will happily keep trying forever. The meta-judgment "stop rewriting, fix
    the project" is orchestration you add around the framework; the framework just gives you a loop.
  3. The domain knowledge is still yours. "You can't write target-framework APIs inside a module that
    still targets the old framework — extract a neutral seam instead." No agent framework ships that
    insight. You encode it — as system prompts, rules files, or skills. The framework is the vehicle;
    the migration expertise is cargo you load.
  4. Measurement is still yours — and arguably harder. An agentic run is more of a black box than a
    structured pipeline, by design. If you want to compare two models, build a regression suite, or know
    whether last week's change made things better, you still build that scoring shell yourself. Agentic
    throughput and measurability are not the same axis.
  5. Cost, latency, and determinism are still yours. An agentic loop makes many model calls per file.
    The framework makes that easy to write; it does not make it cheap, fast, or reproducible. Budget
    caps, guardrails, and provenance are on you.
  6. The migration strategy is still yours. Porting bottom-up (leaf dependencies first) so each layer
    compiles against already-migrated code — that's a plan, not a capability. No agent decides it for you.

Microsoft vs. Copilot, briefly

  • The Microsoft stack (Microsoft.Extensions.AI → Semantic Kernel → Agent Framework) is a superb
    substrate: IChatClient, tool calling, agents, orchestration patterns, memory. It hands you the
    engine and the wiring. You still supply the tools that touch your repo and build, the verification
    gate, and the domain rules.
  • The GitHub Copilot SDK / coding agent sits closer to the finish line — it's already code- and
    repo-aware and built to edit and run things. It dissolves the diff/context problems essentially out
    of the box. But the verification gate, the domain rules, and measurement are still yours.

The difference between them is mostly how much of the agentic plumbing is pre-built. Neither one
makes the judgment problems go away, because those aren't plumbing.

The shape that actually works

It's the same hybrid I landed on last time, just with the engine swapped for a framework instead of a
hand-rolled loop: let the framework (or coding agent) be the capable, tool-using worker, and wrap it
in your own shell that owns verification, measurement, cost control, and the domain rules.
Use the
framework for what it's genuinely great at — tools, orchestration, multi-turn — and keep ownership of
the parts that encode what a correct migration is and whether you're getting better at it.

So: would an agent framework have saved me effort? Yes — a meaningful chunk of it, the mechanical chunk.
But it would have moved my work up the stack, from fighting diffs and feeding context to encoding
judgment and measuring outcomes. That's a much better place to be spending your time. It is not,
however, the same as the work disappearing. The framework changes where the hard part lives. It
doesn't change that there is one.