A notebook about our connected future by Danilo Campos.

How to get help from a robot

Imagine you’re in a cave.

In every direction, passages and galleries lead away into the dark. There are occasional dots of light in the murky depths, but most of the space around you is steeped in shadow.

You have somewhere to get within this cave system. You have to define a path between where you are and where you want to go. Given the many branches, the darkness, and the sheer ground to cover, this could take a while.

Now imagine a pouch on your hip. Inside it is a robot. You can remove the robot and send it off to help map the space for you. You can define its strategy pretty flexibly, following your existing hunches. You can also ask it to survey the darkness, giving you the lay of the land.

If the robot needs tools to aid its explorations, you can attach them and the robot can use them. If the robot needs information, like existing data about the cave, you can provide it.

And you can deploy this robot over and over until you’ve covered the territory you need to move confidently.

You can have this fantasy today, using so-called “agents” driven by LLMs.

This year, LLM-based agents crossed a threshold in maturity: the technology is now ready to help you solve loads of problems, directly manipulating files and resources on your behalf. They’re comfortable to use and easy to get started with.

If you are outside this trend, that AI detail might give you pause. You’ve heard a lot about LLM-based products like ChatGPT, and not all of it is great. LLMs are sometimes said to be little more than autocomplete, or prone to hallucinations. You’ve heard that LLMs, applied stupidly, have invented everything from fake case law to fake airline cancellation policy.

LLMs, from this perspective, might seem unpredictable and inconvenient.

An ox is willful, and may not always want to plow the field. But combine the ox with a harness and yoke, and now its energy can be reliably focused on a productive task.

An LLM is much like the ox here, and while its output isn’t perfect, its stamina is. An agent yokes an LLM to both other tools and to a workflow that can dramatically improve its correctness. Correctness can improve further, through the strategic introduction of existing information.

This isn’t a tool that replaces humans. Properly applied, it’s a tool that amplifies our imagination, discovery and ambition, creating more leverage for our finite time on this planet.

Most importantly, I think it’s best to see agents as things that explore on your behalf, rather than things that create for you. The agent is best applied as a mechanism for deepening your understanding of a problem, addressing the tedious bits you’d rather not do yourself.

An agent might nonetheless create a ton of output for you! They’re great for prototypes, boilerplate, and creating structures you’ve defined carefully.

But the only way to get output you’re happy and confident with is by really understanding the problem you’re working on. Arguments about the usefulness of LLMs really miss this point: agents are incredible tools for figuring out your problems quickly and plumbing their depths efficiently.

Here’s how to use agents: basic concepts, plus some concrete guidance with tools that work.

Anatomy of an agent workflow

Using an agent requires understanding a sandwich of different components, each with their own limitations and leverage. Here’s a quick (and incomplete) survey.

The model

The (large language) model is the motive power behind your agent. Think of it as a lossy, biased, incomplete snapshot of human knowledge and culture.

LLMs are expensive to “train.” It takes time and significant computing resources to create one. The result is something both impressive and quite rigid.

A completed model can process information with a staggering variety of structures, from plain language to any number of coding languages, and even some file formats. It can generate structured information just as well. A surprising quality of LLMs is how flexible they are at interpreting and replicating all kinds of patterns.

But the models themselves aren’t changeable. They’re a brick edifice.

The context

Information is physical: an LLM runs inside systems with physical constraints, with ceilings imposed by things like memory chips. Context describes the memory limit allocated for a model to do its work in.

Context is finite.

By contrast to the model, context is also completely malleable. You can put anything you want in there. Part of the strategy for working with agents is ensuring that your context contains enough information to tilt the model in the direction of productive work. Populating context strategically with examples, reference files, and even instructions can productively bias the model so that it’s more likely to do the thing you want. Many agents give you the tools to add entire files to the context, and this is very handy.

But remember: context is finite. You can’t stuff the world into it. You need to provide just enough to cue the model while leaving room for actually producing work.

Because you’ll use that room for things you say to the model, and things it says back to you.

Unlike the solid brick of the model, the context is more like a sponge. It’s malleable, it can absorb a lot, but it has limits.
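The budgeting act described above can be sketched in a few lines of Python. The numbers here are illustrative assumptions, not real model limits: the idea is simply that you pack reference material until a budget runs out, reserving room for the conversation itself.

```python
# Sketch: pack reference files into a finite context window,
# leaving headroom for the dialogue. The budget and the
# chars-per-token ratio below are illustrative guesses.

CONTEXT_TOKENS = 200_000   # hypothetical model limit
CHARS_PER_TOKEN = 4        # rough estimate for English text
HEADROOM = 0.5             # keep half the window free for conversation

def pack_context(instructions: str, files: dict[str, str]) -> str:
    """Concatenate instructions and files until the budget runs out."""
    budget = int(CONTEXT_TOKENS * CHARS_PER_TOKEN * (1 - HEADROOM))
    parts = [instructions]
    used = len(instructions)
    for name, text in files.items():
        entry = f"\n--- {name} ---\n{text}"
        if used + len(entry) > budget:
            break  # context is finite: stop rather than overflow
        parts.append(entry)
        used += len(entry)
    return "".join(parts)
```

Real agents do something more sophisticated (token counting, summarizing, evicting old turns), but the tradeoff is the same: every file you add leaves less room for the work itself.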

The agent

An “agent” is a harness for an LLM. In simple terms, it runs the LLM in a loop, continually prompting it with either automated input, or your own text, so the model can use various tools.
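That loop can be sketched in Python. Everything here is a stand-in: `call_model` fakes the LLM and the tool set is invented, but the shape of it, prompt, act, feed results back, repeat, is the whole idea.

```python
# Sketch of an agent harness: run the model in a loop, letting it
# call tools until it declares the task done. The "model" here is
# a stub standing in for a real LLM API.

def call_model(context: list[str]) -> dict:
    """Stand-in for an LLM call. A real agent would send `context`
    to a model and parse its reply into an action."""
    if any("TOOL RESULT" in line for line in context):
        return {"action": "finish", "text": "done"}
    return {"action": "tool", "tool": "read_file", "arg": "notes.txt"}

TOOLS = {
    "read_file": lambda arg: f"contents of {arg}",  # invented tool
}

def run_agent(task: str) -> str:
    context = [task]  # the context starts with your request
    while True:
        reply = call_model(context)
        if reply["action"] == "finish":  # model says it's done
            return reply["text"]
        result = TOOLS[reply["tool"]](reply["arg"])
        context.append(f"TOOL RESULT: {result}")  # feed results back in
```

The harness, not the model, is what turns a text predictor into something that can act: it keeps appending tool output to the context so the model’s next turn is informed by what just happened.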

But this harness is extensible. Agents can swap models, allowing them to improve as new models come online.

Moreover, agents have convenient interfaces for getting information in and out of the LLM. A common approach to this is a file editor: the same file you’re looking at is also fed to the model. This can get fancy, even piping in details like which lines you’ve got selected. Click and drag over a passage and you can just talk to the agent as though it can see you gesturing at an object.

These niceties are where an agent differentiates itself. Such a product can earn our allegiance through its reliability, conveniences and ease of use.

The tools

Agents integrate other services using MCP, short for Model Context Protocol. MCP enables populating the context with information from outside your agent. You could pull in spreadsheet data, documentation, you name it. Any existing service can provide an MCP server. Even if it doesn’t, you can use its existing APIs to build one yourself. Ask an agent to build it for you!

But MCP travels both ways: agents can use MCP to manipulate external resources on your behalf. An MCP server could also update a spreadsheet based on your instructions to an agent and the contents of your context.

MCP servers extend the reach of your agent into other products, domains and data sources. Of all these components, MCP may be the most exciting because you might just use it in a way no one has thought of yet. Multiple MCP connections can be combined in one session, letting your agent act as a mixing valve between distinct services and data sources.
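Under the hood, MCP is built on JSON-RPC 2.0: the agent and the server exchange messages like the one below. The method name `tools/call` comes from the MCP spec; the tool and its arguments here (a spreadsheet update) are invented for illustration.

```python
import json

# A hypothetical MCP tool invocation, expressed as a JSON-RPC 2.0
# request. "tools/call" is the MCP method for invoking a tool;
# "update_row" and its arguments are made up for this example.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "update_row",  # hypothetical spreadsheet tool
        "arguments": {"row": 7, "status": "shipped"},
    },
}

print(json.dumps(request, indent=2))
```

You rarely see these messages yourself; the agent composes them for you. But knowing the shape demystifies what “connecting a tool” actually means: structured requests and replies over a shared protocol.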

Put it to use

My favorite agent tool right now is Claude Code. It runs in your terminal, giving you a robot butler that can do a huge amount of stuff with your projects, and help you solve problems with your tools and computer besides.

Probably the easiest way to start using it is to grab Visual Studio Code and add the Claude Code extension.

You choose the places Claude Code has access to, either by opening a folder in VS Code, or using your terminal to navigate somewhere specific, as in:

cd ~/Documents/my-big-project

Then you just invoke the tool with one command:

claude

From here Claude Code can analyze existing files and projects, building summaries for you—or even for itself, to help in future runs.

You can point Claude Code to specific files by using the @ symbol: you’ll get an auto-completion menu. Adding documentation and other references to your project, then pointing Claude Code in their direction, can dramatically calibrate its output. 1

With Claude Code running, the /init command builds a reference file based on a deep dive of the whole project.

If you need to come back to an existing conversation, use /resume.

Claude Code can also run commands on your behalf. If you’re having problems with version control, or don’t want to learn more than the basics of git, this tool can help you work out the one-off incantations you need for more advanced troubleshooting.

Claude Code is also great at interpreting errors for you. You can paste log spew, or even ask it to diagnose errors from the commands it runs for you.

Connect Claude Code to other tools and experiment. If you love Notion, Airtable or Obsidian, MCP servers exist to let your data feed into Claude Code easily, and for the tool to collaborate with you.

How do I know it’s doing the right thing?

This is the most important skill to develop when working with these tools: getting to correctness.

The agent can do a lot, but it will do what you say, not what you mean. So taking some time to really think through your problem, and documenting that thinking for the robot, can be valuable. I like to do this by creating a new file in the project directory. Plain text or Markdown works fine. From here I use writing to think through the problem: why I’m working on something, what I hope to achieve, and the specific approaches I feel are valid.

In another age, this would be a design document or specification. But here your audience is the robot, and perhaps yourself in the future.

You don’t need to be exhaustive. In fact, you can work on this document iteratively, enhancing it as the robot tries things for you and your understanding improves.

If you use Claude Code inside of an editor, you can review changes and new files as they’re generated. You can edit them directly as these changes come in, and reject directions you don’t like. I’d give this editor/Claude Code combo a try even if you’re not working on code-specific projects. It’s a very handy way to work with all kinds of problems, giving you a tidy menu of files, multiple editing tabs, and your agent right next to it all.

And again, some of the best uses of the agent don’t demand perfect correctness: one-off scripts that query an API, prototypes that validate your thinking and expectations, frameworks and templates built from existing examples… you can get so much from a robot spelunking the depths of your computer, reading and writing files, and helping you understand the troubleshooting leverage in front of you.

There’s no guarantee the robot will explore the problem space perfectly, nor create perfect solutions. But you’ve got the same constraints. The robot’s advantage is its speed and stamina. Combine it with your discernment and experience, and find your powers amplified.


  1. Pro-tip: robots can help robots. You can use an LLM research product, like Claude or ChatGPT in “deep research mode” on the web, to gather supporting information for you. These tools can scour message boards for up-to-date information that’s not otherwise collected anywhere. You can then ask them to output this research into a file you can download, add to your project, and include in your Claude Code context. This is really helpful for cutting-edge tech with poor documentation.

©2025 Danilo Campos
