Where's the agent debugger?
A computer is a zoetrope: an illusion of persistence and coherence, built on untold numbers of individual frames of detail.
Even as you read this, beneath the surface of your machine endless lines of code zoom past, at rates so extraordinary the human mind can’t comprehend them. Every second, a modern computer churns through billions of cycles.
This is great when everything is working.
But in the history of everything that works, there was a time when it didn’t. And to get a piece of code out of that place, you’ll need to do some debugging.
Every crash a crime scene
When your program crashes, something did it. Something is responsible. Something came smashing into the assumptions that all of your code is relying upon.
Sometimes it’s simple stuff: we’re missing a value that we expected to exist, and we’ve written no code to handle its absence. Or, we’ve tried to grab the third member of a collection that has just two objects.
Sometimes it’s much messier: trying to touch memory that no longer belongs to us. Trying to use an object that no longer exists. All of these are violations of the simple contract for reality that our code is built around.
Its universe shattered, the program has no choice but to end.
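The two simple cases above can be sketched in a few lines of Python. This is only an illustration: `settings`, `team`, `font_size`, and `third_member` are made-up names, and the `try`/`except` is there purely so both crashes can be shown in one run.

```python
# Hypothetical data, invented for this illustration.
settings = {"theme": "dark"}   # note: no "font_size" key
team = ["Ada", "Grace"]        # only two members


def font_size(config):
    # Assumes the key always exists; raises KeyError when it is missing.
    return config["font_size"]


def third_member(people):
    # Assumes at least three entries; raises IndexError when there are two.
    return people[2]


# Without the try/except, the first failure would end the program.
for risky_call, argument in ((font_size, settings), (third_member, team)):
    try:
        risky_call(argument)
    except (KeyError, IndexError) as err:
        print(f"{risky_call.__name__} crashed: {type(err).__name__}: {err}")
```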
Other bugs are less catastrophic: a mysterious, transient freeze. Something happens twice that should only happen once. Something is missing that really should be present.
Building software is an endless soap opera of whodunnits where the detective and the perpetrator are often the same person: the luckless programmer trying to make sense of many stacked layers of opaque but powerful computing abstraction.
It’s hard to reason about so many interlocking parts moving so fast. Past a certain level of complexity, you just can’t keep it all inside your head.
So, somehow, we have to investigate, shining light into black boxes of our own creation.
Cowboy style
The easiest way to debug code you control is to add logging.
print("Here's the part of the code I really care about. It happened!")
When the program runs, such log messages print out in a terminal. Use a logging library instead of a bare print and each one arrives with a timestamp, too.
To continue the zoetrope analogy, this is tagging a sticky note on a specific frame and then observing whether and how often you see that sticky later in the program’s run.
You can get really far with this approach: depending on how much logging you add, you can develop a clear sense of what’s happening with your program at any given moment.
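Here is a slightly less cowboy version of the same sticky note, as a minimal sketch using Python's standard `logging` module. The logger name `checkout` and the function `apply_discount` are invented for the example; the format string is what adds the timestamp and severity to each line.

```python
import logging

# The format string stamps every message with a time and a severity level.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("checkout")  # hypothetical module name


def apply_discount(total, percent):
    # Record the inputs, so a bad value is visible in the log later.
    log.debug("apply_discount called with total=%r percent=%r", total, percent)
    discounted = total * (1 - percent / 100)
    log.info("discounted total is %.2f", discounted)
    return discounted


apply_discount(80.0, 25)
```

Raising `level` to `logging.INFO` silences the debug lines without deleting them, which takes some of the sting out of noisy logs.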
But there are also drawbacks: logs can only answer questions that you think of in advance. Worse, the more logging you add, the harder it is to find the one line you actually care about at any given moment.
Logs are a way of reflecting the internal state of a program, but all they can ever be is a one-way firehose of information. Sometimes that isn’t enough.
The interactive debugger
A debugger hands you the reins on the galloping horse that is your program. Instead of following a dizzying path of instructions after they happen, the debugger is an invitation to intervene.
With a debugger, the zoetrope can stop altogether. Your hand is on the wheel, advancing it at whim. You can move through your program line by line, inspecting which code is being run. Every scrap of data in scope for that code is visible too, allowing you to reason through why something is breaking or behaving unexpectedly.
Instead of the processor dictating the pace, you control reality. It’s essentially the power of Neo in The Matrix: you can slow time and act upon objects inside the system.
Even the ones that move very fast.
An experienced code detective traps their quarry using breakpoints: flags in the program that tell the debugger to stop at a specific place. If your gut says that your crash has something to do with making a network request, you might break at the point where the request is triggered, and break again at the point where the result is written to disk.
Breakpoints allow the program to whisk you directly to the crime scene. There’s little upside to reviewing all the lines of code you know are working correctly. Instead, you arrive briskly at the area of your investigation again and again, run after run, testing changes and reviewing their consequences.
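In Python, the quickest way to plant such a flag is the built-in `breakpoint()` call, which pauses the program and drops you into `pdb`, the standard debugger. The sketch below breaks right before a result is written to disk; `save_response` and its `debug` flag are hypothetical, added only so the pause is opt-in.

```python
import json


def save_response(payload, path, debug=False):
    """Write a (hypothetical) network response to disk."""
    if debug:
        # Execution stops here and hands control to pdb.
        # At the prompt: `p payload` inspects the data, `n` steps to the
        # next line, and `c` lets the program continue.
        breakpoint()
    with open(path, "w") as handle:
        json.dump(payload, handle)
    return path
```

Calling `save_response({"status": 200}, "response.json", debug=True)` from a terminal session opens the debugger prompt. Setting the environment variable `PYTHONBREAKPOINT=0` turns every `breakpoint()` call into a no-op, so a stray one need not halt a production run.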
Of all the programming skill sets, I think debugging may be the most consequential. Not just understanding the tool, but developing an intuition for how to use it. What is opaque from the outside can become obvious when you step through the code line by line. Interactive debugging gives you more control, more information.
This is even more important in the age of machine-written code. So far, most people’s robots can’t do this at all.
What if the robot did it
In the past, you were the likely perpetrator of your code crimes.
Today, you might also have a robot accomplice. LLMs can extrude code at a dizzying pace, but all of it requires checking and error correction.
Type systems and linters are a powerful first line of defense against machine errors. If a symbol or function is out of date or simply does not exist, feedback to that effect can be immediately reported, compelling a coding agent to re-work its output, look things up, and otherwise correct itself.
If your next step is a compiler, you get even more error correction fodder. A compiler provides (somewhat) clear feedback about where a problem exists: the file, line number, and the broken expectation. Again, this gives an agent plenty to work with. Compiler output provides an agent with more leads, like which libraries it needs to examine more carefully.
But other errors are more subtle, emerging only at runtime: say, a function in one thread that touches memory another thread is still using.
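A small, contrived sketch of that kind of bug: two threads share one list, and each checks it before acting on it. No type checker or linter objects to this code. The names `run_workers` and `setup_once` are invented here, and the `time.sleep` exists only to widen the window in which the threads can collide.

```python
import threading
import time


def run_workers(use_lock):
    """Start two threads that each try to run a one-time setup step."""
    setup_runs = []           # shared memory: both threads read and append here
    lock = threading.Lock()

    def setup_once():
        if not setup_runs:        # check...
            time.sleep(0.05)      # artificially widen the gap between check and act
            setup_runs.append(threading.current_thread().name)  # ...then act

    def worker():
        if use_lock:
            with lock:            # only one thread may check-and-act at a time
                setup_once()
        else:
            setup_once()

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(setup_runs)


# Timing-dependent: the unlocked run usually performs setup twice.
print("setup runs without lock:", run_workers(use_lock=False))
# With the lock, the second thread always sees the first thread's work.
print("setup runs with lock:", run_workers(use_lock=True))
```

The unlocked result depends on how the threads happen to be scheduled, which is exactly what makes this class of bug hard to catch before the program is running.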
For such errors, the fastest way to track them down is to fire up the debugger and step through the code.
But agents don’t know how to do that. Agents debug cowboy-style, shitting logs all over your program and making you paste back what they see. It’s crude stuff.
There’s no reason it needs to work this way. The consequential information from a debugger could be piped into an agent’s context, and it could navigate the program much like a human developer: setting breakpoints, stepping over and into code. From these investigations, the agent could propose fixes and architectural improvements.
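To make the idea concrete, here is a rough sketch of the plumbing, not a description of any existing agent tool. It assumes nothing beyond `sys.settrace`, Python's standard tracing hook, and uses it to "break" on a function by name and record the local variables at every line executed inside it. The names `trace_function`, `render_for_agent`, and the deliberately buggy `average` are all hypothetical.

```python
import sys


def trace_function(target_name, func, *args):
    """Run func(*args), recording (line number, locals) for every line
    executed inside any function named target_name."""
    snapshots = []

    def local_tracer(frame, event, arg):
        if event == "line":
            # Copy the locals so later changes do not rewrite history.
            snapshots.append((frame.f_lineno, dict(frame.f_locals)))
        return local_tracer

    def global_tracer(frame, event, arg):
        # Invoked on each new function call; only trace inside the target.
        if event == "call" and frame.f_code.co_name == target_name:
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return result, snapshots


def render_for_agent(snapshots):
    """Flatten the snapshots into plain text that could go into a prompt."""
    return "\n".join(f"line {lineno}: {local_vars}" for lineno, local_vars in snapshots)


def average(values):
    total = 0
    for value in values:
        total += value
    return total / (len(values) - 1)  # deliberate bug: divisor is off by one


result, snapshots = trace_function("average", average, [2, 4, 6])
print("result:", result)
print(render_for_agent(snapshots))
```

The trace shows `total` reaching 12 for three values, so the wrong answer has to come from the divisor. That is the kind of evidence a log line would only capture if someone had thought to ask for it in advance. A real tool would also need conditional breakpoints, stepping into calls, and some limit on how much state gets dumped into the context.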
When such a tool enters common use, on the scale of Claude Code or Cursor, that’s going to be a leap forward in the trustworthiness and effectiveness of agent coding systems. Not to mention their usefulness.
Beyond checking and improving its own output, an agent that could walk you through program execution, explaining why things work the way they do, would be powerful indeed.
But, peering as they do inside of running code, debuggers can be used for lots of things. The security implications of such a tool are surely complicated as well. What if you could set an agent loose on cracking someone’s serial number registration code?
Life in a paradigm shift, man.