What Is an Agent Skill?
Agent systems are starting to look like software systems.
Instead of writing explicit control flow, we provide a goal, a set of tools, and some guidance, and allow the model to determine the steps.
Developers often introduce what are called “skills” — reusable pieces that extend what an agent can do, how it behaves, or how work is structured.
Across systems, a “skill” may refer to a function, a tool wrapper, a role definition, or an entire workflow. These are not the same thing, but they are often treated as if they were.
One of the more concrete definitions comes from Anthropic, where a skill is described as a folder of files (e.g. SKILL.md and helper scripts) that configures an agent’s behavior. This clarifies how skills are packaged, but leaves open questions about their structure: what kind of skill is being defined, when it should be split, and how it should be scoped for reuse.
A more precise mental model turns out to be useful.
A More Precise Model
A useful way to make “skills” more precise is to separate two questions:
- What does the skill describe?
- How does the skill execute?
What the Skill Describes
There are three common types.
A Persona skill defines how the agent behaves — its tone, expertise, and decision patterns.
A Tool skill defines what the agent can do — typically an API, function, or external capability.
A Workflow skill defines how work is structured — a sequence of steps or coordination across multiple actions.
These three are frequently conflated, but they operate at different levels of abstraction: a persona shapes how the agent interprets a task, a tool grants a discrete capability, and a workflow imposes structure on a sequence of actions.
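These distinctions can be written down directly. The sketch below is illustrative Python, not any framework's actual API; the class and field names are hypothetical:

```python
from dataclasses import dataclass
from enum import Enum


class SkillType(Enum):
    """The three common kinds of skill, at different levels of abstraction."""
    PERSONA = "persona"    # how the agent behaves
    TOOL = "tool"          # what the agent can do
    WORKFLOW = "workflow"  # how work is structured


@dataclass
class Skill:
    name: str
    type: SkillType
    instructions: str


# A persona skill: background context, not a command to invoke.
reviewer = Skill(
    name="Code Reviewer",
    type=SkillType.PERSONA,
    instructions="Review with a focus on security and readability.",
)
print(reviewer.type.value)  # persona
```

Even this trivial representation makes the kind of each skill machine-checkable instead of something the model must infer from prose.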
How the Skill Executes
A more important distinction is how a skill interacts with state.
Some skills are stateless. They analyze input or retrieve information without modifying anything external. These can be retried safely and run in parallel.
Others are stateful. They modify external systems — writing data, sending messages, or triggering actions. These introduce constraints: retries may duplicate effects, ordering may matter, and failures may leave partial state.
Most frameworks treat these cases uniformly. This works in simple examples, but breaks in real systems.
This distinction cuts across all types of skills. A tool that reads data behaves very differently from one that writes it. A workflow that analyzes information is fundamentally simpler than one that publishes results.
Once stateful steps are involved, additional structure becomes necessary. It is no longer sufficient to rely on natural language instructions alone. Issues such as ordering, retries, and partial failure need to be handled explicitly.
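As a sketch of what "handled explicitly" can mean, the hypothetical guard below retries only stateless steps; the function name, retry count, and the flaky example are illustrative, not part of any framework:

```python
def run_skill(action, execution: str, max_retries: int = 3):
    """Run a skill step, retrying only when it is safe to do so.

    Stateless steps can be retried freely. Stateful steps get a single
    attempt, since a blind retry could duplicate an external effect.
    """
    attempts = max_retries if execution == "stateless" else 1
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except Exception as e:
            last_error = e
    raise RuntimeError(f"skill failed after {attempts} attempt(s)") from last_error


# A stateless read that fails twice before succeeding is retried to completion;
# a stateful write in the same position would fail after one attempt.
calls = {"n": 0}

def flaky_read():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient failure")
    return "data"

print(run_skill(flaky_read, execution="stateless"))  # data
```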
A Concrete Representation
So far, the distinction is conceptual. To make it usable, it helps to represent skills in a consistent format.
A simple approach is to add a small set of explicit fields to the SKILL.md YAML header:
```markdown
---
name: PR Review
type: workflow        # persona | tool | workflow
execution: stateful   # stateless | stateful
description: Review a pull request for security and code quality issues.
---

## Steps

1. Fetch the pull request
2. Analyze the code
3. Generate feedback
4. Post review comments

## Side Effects

- Writes comments to GitHub
- May update review status

## Related Skills

- Code Reviewer
- Post GitHub Comment
```
This format is intentionally minimal. It is not meant to fully specify execution, but to make the important properties visible.
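Reading these fields back out is straightforward. A minimal sketch, assuming a flat header and no YAML dependency; a real implementation would use a YAML parser and validate the allowed values:

```python
def parse_skill_header(text: str) -> dict:
    """Extract flat key: value fields from a SKILL.md frontmatter block."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, _, value = line.partition(":")
        # Drop inline comments like "# persona | tool | workflow"
        value = value.split("#", 1)[0].strip()
        fields[key.strip()] = value
    return fields


header = """---
name: PR Review
type: workflow  # persona | tool | workflow
execution: stateful  # stateless | stateful
---"""
print(parse_skill_header(header))
# {'name': 'PR Review', 'type': 'workflow', 'execution': 'stateful'}
```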
Why This Helps
Making these properties explicit changes how the system behaves.
The type field clarifies what role each skill plays. Without it, all skills appear as undifferentiated text. A persona, a tool, and a workflow look identical, and the agent must infer how to use them. With it, composition becomes more deliberate. A workflow can be understood as coordinating tools under a given persona, rather than as a flat set of instructions.
The execution field makes the side-effect profile explicit. Without it, reading data and modifying external systems are treated as equivalent operations. With it, the system can distinguish between actions that are safe to repeat and those that require caution.
The combination matters. A workflow marked as stateful signals that it contains steps which change the world. This allows the system to separate analysis from action, introduce checkpoints, and handle failures more predictably.
Without these distinctions, the system remains flexible but ambiguous. With them, it becomes composable and easier to reason about.
Execution and the Role of the CLI
So far, skills are described as structured metadata and instructions. The remaining question is how they are executed.
In practice, most non-trivial skills eventually resolve to operations that interact with external systems: APIs, databases, files, or services. A natural way to expose these operations is through a command-line interface (CLI).
From this perspective, a tool skill can be seen as a thin wrapper around a CLI command. A workflow skill becomes a sequence of such commands, coordinated by the agent.
This makes the execution boundary explicit. In practice, the CLI acts as the execution layer for skills: the model decides what to do; the CLI handles how it is done.
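To make that boundary concrete, here is a hedged sketch of the execution layer as a thin subprocess wrapper. The `run_tool` helper is hypothetical, and appending `--dry-run` to stateful commands assumes the underlying CLI supports that flag:

```python
import subprocess


def run_tool(command: list[str], stateful: bool, dry_run: bool = True) -> str:
    """Execute a tool skill as a CLI invocation.

    The model chooses *which* command to run; this layer controls *how*
    it runs. Stateful commands are routed through --dry-run unless the
    caller explicitly opts out.
    """
    if stateful and dry_run:
        command = command + ["--dry-run"]
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()


# A stateless read runs directly; a stateful write such as
# run_tool(["dv", "post-comment", "--pr", "123", "--text", "Looks good"],
#          stateful=True) would get --dry-run appended first.
print(run_tool(["echo", "hello"], stateful=False))  # hello
```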
This separation becomes especially important once stateful operations are involved.
A command that reads data can be executed freely. A command that writes data — sending a message, updating a record, or triggering an action — introduces risk. It may have side effects that cannot be undone, or that should not be repeated.
This is where a --dry-run mode becomes essential.
A well-designed CLI command should support running in dry-run mode, where the intended action is computed and displayed, but not executed:
```
$ dv post-comment --pr 123 --text "Looks good" --dry-run
Would post the following comment to PR #123:
"Looks good"
```
This serves two purposes. First, it allows the agent to reason about consequences before committing. Second, it introduces a natural checkpoint for human oversight — especially for operations that are irreversible or high-impact.
--dry-run is not an optional feature. It is a direct consequence of distinguishing stateless from stateful skills.
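As an illustration, a minimal sketch of how such a command might implement the flag, using Python's argparse. The `dv post-comment` subcommand and its arguments come from the example above; the GitHub call itself is stubbed out:

```python
import argparse


def post_comment(pr: int, text: str, dry_run: bool) -> str:
    """Post a comment to a pull request, or preview it under --dry-run."""
    if dry_run:
        # Compute and display the intended action without executing it.
        return f'Would post the following comment to PR #{pr}:\n"{text}"'
    # A real implementation would call the GitHub API here.
    return f"Posted comment to PR #{pr}"


def main(argv=None):
    parser = argparse.ArgumentParser(prog="dv post-comment")
    parser.add_argument("--pr", type=int, required=True)
    parser.add_argument("--text", required=True)
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)
    print(post_comment(args.pr, args.text, args.dry_run))


main(["--pr", "123", "--text", "Looks good", "--dry-run"])
```

Because the flag defaults to off, the same command line without `--dry-run` performs the real, stateful action.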
Putting It Together
A skill can be understood along two axes: what it describes, and how it executes.
On the descriptive side, it may define behavior, capability, or coordination. On the execution side, it may be stateless or stateful.
These axes interact. A stateless workflow is fundamentally different from a stateful one. A tool that reads data behaves differently from one that writes it.
Making these distinctions explicit provides a more stable way to reason about how agent systems are structured, how they fail, and how they can be composed.
A Working Model
When defining a skill, two questions are usually sufficient:
- What kind of thing is this?
- What happens if it is run more than once?
The first determines how the skill composes with others.
The second determines how it must be executed.
Together, they form a small but useful model for thinking about agent systems.
Closing
The current generation of agent frameworks provides useful primitives, but the abstractions are still evolving. The concept of a “skill” is one place where a small amount of structure goes a long way.
Making these distinctions explicit does not make systems more rigid. It makes them easier to reason about.
As agent systems become more complex, this tends to matter more.
Appendix: Applying This in Existing Systems
Most current agent frameworks do not natively support explicit type and execution fields. However, these behaviors can be approximated through conventions in system prompts.
A simple approach is to define interpretation rules for how skills should be handled based on their metadata.
```markdown
## Skill Interpretation Rules

When you load or invoke any skill, check its frontmatter for two fields: `type` and `execution`.

### Handling `type`

**`type: persona`**
Do not invoke this as a command. Load it as background context that shapes how you
behave for the rest of the session. Apply its tone, expertise, and decision patterns silently.

**`type: tool`**
Treat this as a discrete capability. Invoke it when the task calls for it and return
its result. No special handling needed.

**`type: workflow`**
Treat this as a self-contained sequence of steps. Work through them in order. Do not
mix steps from other skills into this sequence unless explicitly instructed.

### Handling `execution`

**`execution: stateless`**
Run freely. Retry on failure. No confirmation needed.

**`execution: stateful`**
Before executing, stop and do two things:

1. If the skill or its underlying command supports `--dry-run`, run that first and
   show the output.
2. Summarize what you are about to do and what will change, then ask for
   confirmation before proceeding.

Never skip this checkpoint for stateful skills, even if the task seems straightforward.

### Fallback rules

- If `type` is missing, use the information in the skill to guess its type.
- If `execution` is missing, treat it as `stateful` and apply the checkpoint.
- If a workflow is stateful, treat all its steps as stateful unless they declare otherwise.
```
This approach is convention-based rather than enforced, but it provides much of the benefit without requiring changes to the underlying framework.
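The cautious defaults in the fallback rules are also easy to enforce outside the prompt. A hypothetical sketch; the function name is illustrative:

```python
def requires_checkpoint(frontmatter: dict) -> bool:
    """Decide whether invoking a skill needs the dry-run + confirmation
    checkpoint, applying the fallback rule that a missing `execution`
    field is treated as stateful (the safe default)."""
    return frontmatter.get("execution", "stateful") == "stateful"


# An explicitly stateless skill runs freely; one that declares nothing
# is handled cautiously.
print(requires_checkpoint({"type": "tool", "execution": "stateless"}))  # False
print(requires_checkpoint({"type": "workflow"}))                        # True
```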
These patterns become more concrete in systems that treat skills as executable units.
For example, in the deepvista-cli (https://github.com/DeepVista-AI/deepvista-cli) — part of the DeepVista AI stack — skills are exposed through a CLI interface, making ideas like dry-run for stateful operations directly usable.
Running the installation setup also applies a small patch to tools like Claude Code and OpenClaw so that they respect these conventions automatically. In practice, this means agents are set up out of the box to interpret skill types correctly and to enforce execution constraints such as dry-run for stateful steps, without additional prompting or manual rules.