ADR-0001: Standard vs Raw Skill Categorization
- Date: 2026-05-06
- Authors: Matteo Rizzo
- Status: Accepted
- Approval State: Approved (Approved by: Matteo Rizzo on 2026-05-06)
- Implementation State: Completed
1. Context and Problem Statement
mirai-shared-skills ships a heterogeneous catalog of capabilities: some are intrinsically safe (a weather lookup, a PDF text extractor, an HTTP page fetcher), others are intrinsically dangerous (a database driver that can DROP TABLE, a sandboxed subprocess runner that can execute arbitrary shell commands). All of them are exposed as BaseSkill objects and are routable by mirai-agent-core's SemanticRouter without per-tool gating at the package boundary.
If every skill were treated identically, downstream clients would have no signal — at install time or at runtime — about which skills require explicit security policies and which can be plugged in directly. Either the catalog defaults to "everything is dangerous" (and clients pay a wrapping tax for trivial tools), or the catalog defaults to "everything is safe" (and a developer who imports RawDatabaseSkill for a one-off prototype ships a SQL-injection-on-prompt vector to production).
We need a first-class taxonomy that the catalog itself expresses, that downstream clients can introspect, and that the SecureSkill wrapper from agent-core ADR-0012 can compose against.
2. Decision Drivers (Forces)
- Default-safe vs default-deny tension: A blanket default-deny at catalog level pushes wrapping work onto every client; a blanket default-safe ships dangerous tools by accident.
- Discoverability: SkillDiscoverySkill enumerates descriptors; the discovery payload should make the security expectation visible before a client wires the skill into an agent.
- Static checkability: SkillCategory should be a Literal[...], not free-form text, so mypy --strict catches typos.
- Runtime-enforceable: Downstream orchestration code can scan descriptors and fail closed if any raw skill is registered without a SecureSkill wrapper.
- Symmetric with agent-core: agent-core ADR-0012 already establishes the SAFE / REQUIRES_HITL / BLOCKED per-tool taxonomy. The catalog-level taxonomy should complement, not duplicate, that runtime taxonomy.
3. Considered Options
- Option 1: Untagged catalog (status quo before this ADR). Every skill is just a BaseSkill; downstream clients eyeball the source to decide what to wrap.
- Option 2: Per-skill ad-hoc flags. Each skill class declares its own is_dangerous: bool or similar. No central taxonomy.
- Option 3: Two-tier descriptor-level categorization (standard / raw). Catalog-level SkillDescriptor.category is one of two literal strings. standard skills are safe to attach directly; raw skills MUST be wrapped in SecureSkill by the downstream client before reaching an agent.
- Option 4: N-tier (safe / requires_review / requires_hitl / blocked). Mirror agent-core's runtime taxonomy at the catalog level.
4. Decision Outcome
Chosen option: Option 3 (two-tier standard / raw), because it surfaces a clear, binary intent at install/discovery time without re-litigating per-tool security decisions that already belong to SecureSkill at runtime.
The catalog defines:
# mirai_shared_skills/_registry.py
from typing import Literal
SkillCategory = Literal["standard", "raw"]
And every descriptor declares its category:
SkillDescriptor(
    name="execution-debugging",
    category="raw",  # raw: must be wrapped in SecureSkill before it reaches an agent
    import_path="mirai_shared_skills.execution.ExecutionDebuggingSkill",
    ...
)
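Because mypy only checks statically typed call sites, a category string loaded from configuration or JSON could still slip through. The following is a minimal sketch of how a descriptor might re-check the value at construction time. The dataclass below is a hypothetical stand-in that models only three fields; it is not the actual SkillDescriptor from _registry.py, and the __post_init__ check is an assumption, not something this ADR requires.

```python
from dataclasses import dataclass
from typing import Literal, get_args

SkillCategory = Literal["standard", "raw"]


@dataclass(frozen=True)
class SkillDescriptor:
    """Hypothetical stand-in: the real descriptor carries more fields."""

    name: str
    category: SkillCategory
    import_path: str

    def __post_init__(self) -> None:
        # get_args() returns the allowed literal values, so the runtime
        # check stays in sync with the type alias automatically.
        if self.category not in get_args(SkillCategory):
            raise ValueError(f"unknown skill category: {self.category!r}")


descriptor = SkillDescriptor(
    name="execution-debugging",
    category="raw",
    import_path="mirai_shared_skills.execution.ExecutionDebuggingSkill",
)
```

Deriving the allowed values from the Literal alias keeps a single definition of the taxonomy, which matches the "single source of truth" goal of this ADR.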
Semantics:
- standard skills are safe to compose with mirai-agent-core directly. Their tools either have no side effects, or their side effects are bounded by the skill's own implementation (e.g. WeatherSkill only reads a public API, PdfExtractionSkill only parses bytes the agent already has). A standard skill's documentation MAY still recommend wrapping with SecureSkill for defense in depth, but the catalog does not require it.
- raw skills ship powerful primitives that an autonomous agent should not invoke without a per-tool SecurityLevel policy. Their documentation MUST state the wrapping requirement. The canonical worked examples are ExecutionDebuggingSkill (subprocess execution in an isolated sandbox dir) and RawDatabaseSkill (arbitrary SQL execution). Downstream linters MAY refuse to ship if a raw skill is registered without a SecureSkill wrapper.
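A downstream fail-closed check along these lines could look roughly like the sketch below. Everything here is illustrative: Descriptor is a hypothetical two-field stand-in for the real descriptor, the descriptor names "weather" and "raw-database" are invented for the example, and the way a client records which skills it has wrapped (a plain set of names) is an assumption.

```python
from dataclasses import dataclass
from typing import Collection, Iterable, Literal

SkillCategory = Literal["standard", "raw"]


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical minimal descriptor used only for this sketch."""

    name: str
    category: SkillCategory


class UnwrappedRawSkillError(RuntimeError):
    """Raised when a raw skill would reach an agent without a wrapper."""


def require_wrapped(descriptors: Iterable[Descriptor], wrapped: Collection[str]) -> None:
    # Collect every raw skill the client has not declared as wrapped.
    unwrapped = sorted(
        d.name for d in descriptors if d.category == "raw" and d.name not in wrapped
    )
    if unwrapped:
        # Raise explicitly instead of using assert: assert statements are
        # removed when Python runs with -O, which would disable the guard.
        raise UnwrappedRawSkillError(
            "raw skills registered without a SecureSkill wrapper: " + ", ".join(unwrapped)
        )


registered = [
    Descriptor("weather", "standard"),
    Descriptor("execution-debugging", "raw"),
    Descriptor("raw-database", "raw"),
]

# Passes silently because both raw skills are declared as wrapped.
require_wrapped(registered, wrapped={"execution-debugging", "raw-database"})
```

Calling require_wrapped with a wrapped set that omits "raw-database" raises UnwrappedRawSkillError, so a startup script using it stops before any agent is constructed.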
The taxonomy is intentionally coarser than agent-core's per-tool SecurityLevel. Per-tool decisions (SAFE vs REQUIRES_HITL vs BLOCKED) are client concerns: which database tables are HITL'd depends on which tables exist in that client's schema. The catalog can't know that. What the catalog can know is whether a skill exposes primitives that any client will need to gate.
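To illustrate the split of responsibilities, here is a sketch of what a client-owned per-tool policy might look like. The SecurityLevel enum below only reuses the three level names mentioned in this ADR; its definition, the tool names, and the default-to-BLOCKED lookup are all assumptions made for the example and do not reflect the actual agent-core API.

```python
from enum import Enum
from typing import Mapping


class SecurityLevel(Enum):
    """Hypothetical mirror of the three runtime levels named in ADR-0012."""

    SAFE = "safe"
    REQUIRES_HITL = "requires_hitl"
    BLOCKED = "blocked"


# Client-specific policy for a raw database skill. The tool names are
# invented; a real client would list the tools its wrapped skill exposes.
CLIENT_DB_POLICY: Mapping[str, SecurityLevel] = {
    "select_rows": SecurityLevel.SAFE,
    "update_rows": SecurityLevel.REQUIRES_HITL,
    "drop_table": SecurityLevel.BLOCKED,
}


def level_for(tool_name: str, policy: Mapping[str, SecurityLevel]) -> SecurityLevel:
    # Tools missing from the policy fall back to BLOCKED, so a newly added
    # tool is never callable until the client has classified it.
    return policy.get(tool_name, SecurityLevel.BLOCKED)
```

The catalog contributes only the fact that the skill is raw; the table of levels lives entirely in client code, where the schema is known.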
4.1. Validation / Compliance
- mypy --strict enforces SkillDescriptor.category: SkillCategory, so free-form strings are rejected at type-check time before they reach production.
- The descriptor registry test suite asserts every registered descriptor has a non-default category.
- Documentation lint: every skill page in docs/skills/ must include a "Security considerations" section that either (for standard) cross-links this ADR, or (for raw) states the SecureSkill wrapping requirement explicitly.
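The documentation lint rule could be approximated with a check like the one below. This is a rough sketch, not the project's actual lint: it assumes pages are plain text or Markdown, that a cross-link to this ADR can be detected by the string "ADR-0001", and that the "Security considerations" section runs to the end of the page.

```python
from typing import Literal

SkillCategory = Literal["standard", "raw"]


def lint_skill_page(text: str, category: SkillCategory) -> list[str]:
    """Return a list of problems found on one skill page (empty if clean)."""
    problems: list[str] = []
    lowered = text.lower()
    if "security considerations" not in lowered:
        problems.append("missing 'Security considerations' section")
        return problems

    # Simplification: treat everything after the heading as the section.
    section = text[lowered.index("security considerations"):]
    if category == "raw" and "SecureSkill" not in section:
        problems.append("raw skill page does not state the SecureSkill requirement")
    if category == "standard" and "ADR-0001" not in section:
        problems.append("standard skill page does not cross-link ADR-0001")
    return problems


raw_page = "# Raw database\n\n## Security considerations\nWrap in SecureSkill first.\n"
standard_page = "# Weather\n\n## Security considerations\nRead-only public API.\n"

raw_problems = lint_skill_page(raw_page, "raw")
standard_problems = lint_skill_page(standard_page, "standard")
```

Here raw_problems is empty, while standard_problems reports the missing cross-link. A production lint would more likely parse headings properly than rely on substring matching.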
5. Pros and Cons of the Options
Option 1: Untagged catalog
- Pros: Zero design overhead.
- Cons: Every downstream developer rediscovers the safety profile of every skill. Newer team members ship dangerous tools accidentally.
Option 2: Per-skill ad-hoc flags
- Pros: Each skill author owns their own security signal.
- Cons: No central enumeration; downstream linters must know every skill's flag name. No mypy enforcement of valid values.
Option 3 (chosen): Two-tier descriptor-level categorization
- Pros:
  - Single source of truth, queryable via SkillDescriptor.category.
  - Coarse enough to be a stable contract (raw doesn't need to subdivide as new dangerous tools appear).
  - Composes cleanly with SecureSkill's per-tool taxonomy: the catalog says "this needs wrapping", the wrapper says "and here's how each tool inside it is gated".
  - mypy --strict-friendly.
- Cons:
  - Requires a per-skill judgment call when authoring new skills: borderline cases (e.g. AgentBrowserSkill, which fetches arbitrary URLs) need a documented rationale in the skill's ADR or page.
Option 4: N-tier mirroring agent-core's runtime taxonomy
- Pros: Symmetry: one vocabulary across catalog and runtime.
- Cons: Requires the catalog to make decisions it can't actually make (REQUIRES_HITL for which concrete tools depends on the client). Premature precision.
6. Consequences
- Positive Consequences:
  - Discoverability via find-skills: the descriptor's category field is part of the discovery payload, so an agent or developer can ask "which raw skills are registered?" and immediately see them.
  - Downstream clients can write a one-line guard: assert all(d.category == "standard" for d in undeclared_skills).
  - The two raw skills today (ExecutionDebuggingSkill, RawDatabaseSkill) are not the only dangerous primitives the catalog will ever ship; the taxonomy gives future authors a place to land without further design work.
- Negative Consequences / Trade-offs:
  - Two-tier coarseness: a standard skill that turns out to be dangerous in some client contexts (e.g. AgentBrowserSkill against an internal-only network) must rely on the documentation's "Security considerations" section rather than a catalog-level signal. We accept this trade-off because finer categories at the catalog level are not actionable.
- Risks & Mitigations:
  - Risk: A new author marks a dangerous skill standard by oversight. Mitigation: PR template requires authors to justify the category in the ADR for any net-new skill; existing raw skills are the canonical reference cases.
7. Implementation Plan & Status Updates
- Target Milestone/Release: v0.1.0 (current).
- Implementation Notes:
  - 2026-05-06: ADR formalizes the existing two-tier split. SkillCategory = Literal["standard", "raw"] and SkillDescriptor.category: SkillCategory are already in mirai_shared_skills/_registry.py. No code changes required; this ADR documents intent.
8. References / Related Documents
- mirai_shared_skills/_registry.py: SkillCategory, SkillDescriptor.
- mirai_shared_skills/__init__.py: bootstrap() populates each descriptor's category.
- mirai_shared_skills/execution/skill.py: canonical raw skill (sandboxed subprocess).
- mirai_shared_skills/database.py: second raw skill.
- agent-core ADR-0012: Declarative Security Policies via Skill Wrappers. The runtime per-tool taxonomy this ADR composes with.