Most people are still treating AI agents like better chat windows.
Ask a question. Get an answer. Fix the answer. Ask again.
That works when the job is small.
It breaks the moment the work starts looking like real software work: planning, implementation, review, failed tests, pull request comments, merge decisions, cleanup, and the awkward moment when nobody is sure what should happen next.
At that point the prompt is not the interesting unit anymore.
The loop is.
That is the term I keep coming back to: loop engineering.
Loop engineering is the practice of designing the system that decides who does what, in what order, with which checks, and how the work recovers when something goes wrong.
You stop asking, “What should I tell this agent?”
You start asking, “What kind of team should exist around this work?”
The abstraction layer moved up
Traditional prompting is a one-person conversation.
- Give an agent a task
- Wait for the result
- Correct the result
- Repeat
Loop engineering feels more like standing in front of a project board with a small software company behind you.
- Give an orchestrator an outcome
- Tell it how work should be split
- Define which specialists should be involved
- Require verification, review, and retry loops
- Watch for bottlenecks and redirect the system
The work is no longer just writing the right prompt.
The work is deciding how the machine team should behave.
Who plans?
Who builds?
Who reviews?
Who verifies?
Who gets pulled in when the backlog starts to smell wrong?
This is why a lot of AI-assisted development now feels less like using a tool and more like running a small team. It also fits with what I wrote about agentic-first development: if software is going to be built and operated by agents, then the interfaces, scripts, workflows, and state layers need to be designed for agent use from the beginning.
Loop engineering is the management layer sitting on top of that.
The prompt is becoming a staffing request
This is the part that took me a while to internalize.
In a serious multi-agent setup, you are often not writing the final prompt anymore.
Your orchestrator is.
You give it the objective, the constraints, the roles, the review standards, and the rules of engagement. Then it writes or assembles the task instructions for the downstream agents.
The prompt starts to look less like handcrafted prose and more like generated infrastructure.
The trend I am seeing in my own testing is simple: the useful prompt is less about creating the perfect instruction and more about building the right team.
It starts sounding like this:
We have a bottleneck on code review. Can we get agents who specialize in code review and assign them to the current pull requests on GitHub? Scale two of them to work through the backlog of outstanding pull requests.
That is a very different kind of instruction.
I am not telling an agent how to review one pull request.
I am telling the orchestrator the team has a constraint. Code review is backed up. Add capacity there.
That is the shift.
The old question was:
What should I ask the model to do?
The new question is:
What system should decide who does what, in what order, with which checks, and how should that system recover when something goes wrong?
That is a much more interesting problem.
It also explains why single-agent prompting starts to feel clumsy on real product work. A feature request is rarely one task. It is research, planning, decomposition, implementation, code review, testing, fixes, merge decisions, and status updates.
Trying to stuff all of that into one long chat with one agent is possible.
It is also a mess.
Treating it as a managed loop is cleaner.
This is where adversarial agents stop being a novelty and start becoming part of the operating model. One agent builds. Another reviews. Another challenges assumptions. Another verifies behavior.
The quality comes from the loop, not from blind trust in any one response.
The orchestrator needs process knowledge
A good orchestrator does more than forward tasks.
It needs a working theory of how software development flows.
At minimum, it should know how to:
- read an epic or feature request
- split it into tractable tasks
- choose the right specialist for each task
- preserve shared context across those tasks
- collect outputs and decide what happens next
- send work to review before merge
- reopen work when review or tests fail
- keep project state visible somewhere outside the chat
In other words, it needs process knowledge.
That is why AI-operable systems matter so much. If your repo, scripts, admin tools, and conventions are not legible to agents, the orchestrator has nothing reliable to work with.
Loop engineering sits on top of operability.
It does not replace it.
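The process knowledge listed above, especially sending work to review before merge and reopening it when review or tests fail, can be sketched as a small state machine. This is a minimal illustration, not a description of any particular orchestrator: the stage names, the `Task` class, the `advance` function, and the `max_attempts` cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names; a real project board may use different columns.
    PLANNED = "planned"
    IMPLEMENTING = "implementing"
    IN_REVIEW = "in_review"
    VERIFYING = "verifying"
    MERGED = "merged"
    ESCALATED = "escalated"


@dataclass
class Task:
    title: str
    stage: Stage = Stage.PLANNED
    attempts: int = 0  # number of implementation passes so far
    history: list = field(default_factory=list)


def advance(task: Task, passed: bool = True, max_attempts: int = 3) -> Task:
    """Move a task one step through the loop.

    `passed` is the result of the current check and only matters while the
    task is in review or verification. A failed check reopens the work;
    after `max_attempts` implementation passes it escalates to a human.
    """
    if task.stage in (Stage.MERGED, Stage.ESCALATED):
        return task  # terminal stages: nothing left to do
    task.history.append(task.stage)
    if task.stage is Stage.PLANNED:
        task.stage = Stage.IMPLEMENTING
    elif task.stage is Stage.IMPLEMENTING:
        task.attempts += 1
        task.stage = Stage.IN_REVIEW
    elif not passed:
        # Review or verification failed: loop back, or stop looping.
        if task.attempts >= max_attempts:
            task.stage = Stage.ESCALATED
        else:
            task.stage = Stage.IMPLEMENTING
    elif task.stage is Stage.IN_REVIEW:
        task.stage = Stage.VERIFYING
    else:
        task.stage = Stage.MERGED
    return task
```

Calling `advance(task, passed=False)` while a task is in review sends it back to implementation, which is the "reopen work" step in code form. The escalation branch is there so a failing task cannot cycle forever without a human seeing it.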
Here is the kind of instruction packet that becomes more useful than a normal prompt:
Take this feature request and break it into implementation tasks.
Assign:
- one agent for task splitting and dependency ordering
- one or more coding agents for implementation
- one reviewer agent for code review and regression risk
- one verification agent to run tests and validate behavior
Rules:
- no task merges without review
- failed verification loops back to implementation
- update GitHub issue and project state after each stage
- stop and escalate if token burn rises without meaningful progress
- deprioritize work that is no longer aligned with the active goal
That is not a prompt in the old sense.
It is a miniature operating model.
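One way to see why this is an operating model rather than a prompt is to write the packet down as data plus a gate. The sketch below is illustrative only: `LoopPolicy`, `TaskStatus`, `next_action`, and the token threshold are hypothetical names and numbers, not part of any real orchestrator or of the setup described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopPolicy:
    roles: dict  # role name -> number of agents assigned
    require_review: bool = True
    require_verification: bool = True
    # Hypothetical budget: tokens spent since the last meaningful progress.
    max_tokens_without_progress: int = 200_000


@dataclass
class TaskStatus:
    reviewed: bool = False
    verified: bool = False
    tokens_since_progress: int = 0


def next_action(policy: LoopPolicy, status: TaskStatus) -> str:
    """Return the next step the rules allow for a single task."""
    # Rule: stop and escalate if token burn rises without progress.
    if status.tokens_since_progress > policy.max_tokens_without_progress:
        return "escalate"
    # Rule: no task merges without review.
    if policy.require_review and not status.reviewed:
        return "review"
    # Rule: verification must pass before merge.
    if policy.require_verification and not status.verified:
        return "verify"
    return "merge"


policy = LoopPolicy(
    roles={"planner": 1, "implementer": 2, "reviewer": 1, "verifier": 1}
)
```

With this policy, a task that has been reviewed but not verified gets `"verify"`, and nothing reaches `"merge"` until both flags are set. The point of the design is that the rules live in one inspectable place instead of being restated in every agent's instructions.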
What this looks like in practice
The environment I have been testing uses an orchestrator agent as the top-level manager for the project.
It does not take a feature request and heroically do everything itself.
It delegates.
One loop splits the work. Another implements. Another reviews. Another handles rework when the pull request is not good enough. The orchestrator keeps the whole thing moving.
That only works because there is a real workflow underneath it.
The agents are not floating around in a blank chat window. They are working through feature branches, GitHub pull requests, review comments, test runs, merge decisions, and cleanup steps in the local development environment.
The deterministic parts still matter.
Git creates the branch.
GitHub holds the pull request.
Tests pass or fail.
Review comments become change requests.
The agents operate inside that machinery instead of replacing it with vibes.
GitHub issues and project boards become the visible state layer. They are the control room wall. If work is queued, blocked, reviewed, ignored, or ready to merge, it needs to show up somewhere other than a chat transcript.
Without that, the whole system turns into expensive fog.
With it, the project starts to feel manageable.
That is the practical appeal of loop engineering. One person gets a higher-level control surface over a lot of moving parts. You are no longer inside every implementation detail all day. You are steering the system that produces the implementation details.
In my own experiments, projects operating this way already move differently. Features get revisited quickly. Bugs surface earlier. Small regressions get cleaned up before they harden into architecture. Work keeps moving without waiting for one person to manually push every step forward.
You feel the difference when the loop is visible.
Shipping, reviewing, revisiting, and fixing can all happen at once.
Why the speed jumps
There are four reasons this accelerates development.
1. Parallelism becomes real
Most solo builders can only hold one serious thread of implementation at a time.
An orchestrated agent system can keep multiple threads alive:
- one agent implementing
- one reviewing
- one researching
- one fixing a failed check
- one updating project state
That changes throughput immediately.
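As a rough sketch of what keeping several threads alive means mechanically, the example below runs stand-in agents concurrently with `asyncio`. The `run_agent` function is a placeholder that only sleeps; a real system would call a model or a tool there. The role names and task labels are made up for illustration.

```python
import asyncio


async def run_agent(role: str, task: str, seconds: float) -> str:
    # Placeholder for a real agent call; the sleep simulates model and tool latency.
    await asyncio.sleep(seconds)
    return f"{role}: finished {task}"


async def run_threads(assignments):
    # Start every assignment at once and wait for all of them to finish.
    coros = [run_agent(role, task, secs) for role, task, secs in assignments]
    # gather returns results in the order the coroutines were passed in,
    # regardless of which one finishes first.
    return await asyncio.gather(*coros)


def run_all(assignments):
    return asyncio.run(run_threads(assignments))


results = run_all([
    ("implementer", "feature branch", 0.03),
    ("reviewer", "open pull request", 0.01),
    ("fixer", "failed check", 0.02),
])
```

The total wall-clock time is close to the slowest single assignment rather than the sum of all three, which is where the throughput gain comes from.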
2. Review moves closer to the work
In a manual workflow, review often waits until a chunk of work feels done.
In a loop-engineered workflow, review can start the moment the work lands. A specialist reviewer can inspect the change. A verification agent can try to break it. The correction cycle tightens.
3. Small fixes stop feeling expensive
A lot of issues stay unfixed because they are annoying, not because they are hard.
With delegation loops, it becomes easy to say:
Take this rough edge, validate it, patch it, run the checks, and send it back through review.
The friction drops.
The cleanup actually happens.
4. The human stays at the right altitude
This may be the biggest one.
If the human operator spends all day rewriting prompts and shepherding each micro-step, the system never compounds.
If the human stays at the orchestration layer, they can steer more work with the same attention.
That is the real abstraction gain.
The human role gets more important
This is not “set the agents loose and disappear.”
It still needs oversight.
More oversight, not less.
The job starts to look less like typing code and more like running a technical organization made of machines.
That means:
- setting priorities
- deciding what not to work on
- watching token burn
- spotting loops that are reinforcing the wrong behavior
- watching for bottlenecks in the project board or pull request queue
- adding reviewer, planner, or implementation capacity when work piles up
- deciding when to merge, pause, or kill a thread of work
- keeping the project aligned with the actual business goal
This is where “bring your own agent” gets more interesting too. The real leverage is not just access to a model. It is having a private operating system for delegation, context, evaluation, and control.
Some agents work on the planning side instead of the coding side.
Take an epic. Split it into tasks. Review the architecture. Check whether the plan still matches the overall direction of the project. Decide what should be ignored for now.
Other agents work the delivery side.
Review this pull request. Run the tests. Patch the change request. Clean up the local branch. Update the project board. Move the next item.
The human is still in the loop, but the job changes.
You watch how the agent team is running.
Is review piling up? Add review capacity.
Are implementation agents producing work that does not fit the architecture? Strengthen the planning loop.
Is the merged app drifting away from the product goal? Stop the work and redirect it.
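The first two of those checks can be partly mechanized by counting items on the board. The sketch below assumes the board has been exported as a list of `(title, column)` pairs; the column names, the limits, and the `suggest_adjustments` function are hypothetical. The third check, whether the product is drifting from its goal, stays a human judgment and is deliberately left out.

```python
from collections import Counter


def suggest_adjustments(board, review_limit=3, rework_limit=2):
    """Suggest team changes from a snapshot of the project board.

    `board` is a list of (item_title, column) pairs. The column names
    "in_review" and "changes_requested" are assumptions for this example.
    """
    counts = Counter(column for _, column in board)
    suggestions = []
    if counts["in_review"] > review_limit:
        # Review is piling up.
        suggestions.append("add review capacity")
    if counts["changes_requested"] > rework_limit:
        # A lot of work is bouncing back, which may point at weak planning.
        suggestions.append("strengthen the planning loop")
    return suggestions


board = [
    ("PR A", "in_review"),
    ("PR B", "in_review"),
    ("PR C", "in_review"),
    ("PR D", "in_review"),
    ("PR E", "changes_requested"),
]
```

For this snapshot, `suggest_adjustments(board)` recommends adding review capacity, because four items sit in review against a limit of three. The limits are arbitrary starting points and would need tuning per project.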
This is not yet the stage where you turn off the monitor, walk away, and come back to a finished project. That is how you waste money and time on an agent team doing the wrong things very quickly.
The better posture is active supervision.
Keep the project board open. Keep a local version of the merged result running. Watch for the parts of the workflow that are backing up or drifting. Then improve the agent team around those constraints.
The models do the labor.
The loop engineer designs the labor system.
Where this is going
I do not think this stays a terminal-only niche for long.
The next wave of tooling will make these loops more visible:
- dashboards for agent status
- org-chart views of active specialists
- live token burn monitoring
- queue management for pending work
- review gates and merge policies for agent output
- maps of what each agent is doing right now
Once that layer exists, the mental model gets much easier to grasp.
You are not prompting an AI.
You are operating a delegated work system.
That is why loop engineering feels like a useful term. It captures the move away from one-off instructions and toward repeatable, inspectable, self-correcting loops.
The individual prompt still matters.
It is just no longer the main thing.
The main thing is designing a system where agents can coordinate, review, recover, and keep making progress without needing a human to push every domino by hand.
That is a new layer of software work.
The people who learn to run it well will not just write better prompts.
They will run better machine teams.
