ADR-0004: In-Process Skill Descriptor Registry

  • Date: 2026-05-06
  • Authors: Matteo Rizzo
  • Status: Accepted
  • Approval State: Approved (Approved by: Matteo Rizzo on 2026-05-06)
  • Implementation State: Completed

1. Context and Problem Statement

SkillDiscoverySkill is the catalog's introspection capability — an autonomous agent calls list_skills() or search_skills(query) to discover what's available, then load_skill_instructions(name) to read the skill's instructional payload (see ADR-0008). For this to work, the agent needs a runtime view of every registered descriptor: name, category, import path, tools, references.

There are several plausible places to keep that view: a YAML file shipped alongside the package, a filesystem scan that walks mirai_shared_skills/ and reads each skill.py, or an in-memory dict populated at import time. Each has trade-offs around startup cost, runtime cost, mutability, and packaging complexity.

We need a registry that is cheap to query (the discovery skill can be called on every agent turn), zero-config for clients (no extra YAML to ship), and statically checkable (descriptors are typed dataclasses, not free-form dicts).

2. Decision Drivers (Forces)

  • Zero runtime discovery overhead: list_skills() is on the hot path on every turn in which the router considers it.
  • Single source of truth: The same descriptors that find-skills returns are also the ones agent-core's tooling can consult.
  • Import-time cost is acceptable: mirai-shared-skills is imported once per process; an O(N) bootstrap pass is fine.
  • No external state: No YAML, no JSON, no SQLite. The registry should require nothing beyond import mirai_shared_skills.
  • Static typing: SkillDescriptor is a typed dataclass; mypy must enforce the shape.
  • Test-friendliness: Tests should be able to construct an empty registry, register a fake descriptor, and assert lookup works — without monkey-patching globals.

3. Considered Options

  1. Option 1: YAML manifest in mirai_shared_skills/catalog.yaml. Loaded at import time, parsed with PyYAML.
  2. Option 2: Filesystem scan. At import, walk subpackages and introspect each skill class.
  3. Option 3: Module-level dict populated by an explicit bootstrap([SkillDescriptor, ...]) call in __init__.py (chosen).
  4. Option 4: Decorator-based registration (@register_skill on each class).
  5. Option 5: importlib.metadata entry points.

4. Decision Outcome

Chosen option: Option 3 (explicit bootstrap() call from __init__.py), because it keeps registration visible — every descriptor that ships is enumerated literally in mirai_shared_skills/__init__.py — and avoids the "where did this skill come from?" debugging tax that decorators and entry points impose.

The implementation lives in mirai_shared_skills/_registry.py:

from collections.abc import Iterable
from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class SkillDescriptor:
    name: str
    description: str
    instructions: str
    category: SkillCategory          # see ADR-0001
    import_path: str
    tools: tuple[str, ...]
    references: tuple[str, ...]
    # ... etc

# Module-level state.
_REGISTRY: dict[str, SkillDescriptor] = {}

def register(descriptor: SkillDescriptor) -> None: ...
def get(name: str) -> SkillDescriptor: ...
def find(query: str) -> list[SkillDescriptor]: ...
def all_descriptors() -> tuple[SkillDescriptor, ...]: ...
def bootstrap(descriptors: Iterable[SkillDescriptor]) -> None: ...

mirai_shared_skills/__init__.py calls bootstrap([...]) with a literal list of every descriptor at import time:

bootstrap([
    SkillDescriptor(name="find-skills", category="standard", ...),
    SkillDescriptor(name="execution-debugging", category="raw", ...),
    # ... etc
])

SkillDiscoverySkill queries the registry through find() / all_descriptors(). Tests construct fresh descriptors and call register() directly; a pytest fixture clears _REGISTRY between tests so global state doesn't leak.

4.1. Validation / Compliance

  • mirai_shared_skills/__init__.py is the single place new descriptors are added.
  • Importing mirai_shared_skills populates the registry exactly once; subsequent bootstrap() calls in tests must be paired with a clear.
  • mypy --strict enforces the descriptor's shape at every register call site.

5. Pros and Cons of the Options

Option 1: YAML manifest

  • Pros: Editable without code changes.
  • Cons: Drift risk between YAML and code; no static checking; an extra file to ship and parse.

Option 2: Filesystem scan

  • Pros: Self-discovering — no manual list to maintain.
  • Cons: Import-time cost grows with package size; magic; hard to test deterministically.

Option 3 (chosen): Explicit bootstrap() from __init__.py

  • Pros:
      • Every shipped descriptor is visible in one file (__init__.py).
      • Static checking via dataclass + mypy.
      • Trivial test fixtures (registry.bootstrap([fake])).
      • No discovery cost after import: queries read an in-memory dict.
  • Cons:
      • Adding a new skill requires editing __init__.py in addition to the skill subpackage: one extra step.

Option 4: Decorator-based

  • Pros: Co-locates registration with the class.
  • Cons: Side effects on import; debugging "why isn't my skill registered?" requires understanding decorator order.

Option 5: Entry points

  • Pros: External packages can register their own skills.
  • Cons: Premature; this is an internal-shared catalog.

6. Consequences

  • Positive Consequences:
      • import mirai_shared_skills is the only setup. No bootstrap call in user code.
      • find-skills always returns the registered set, so there is no race between "skill class exists" and "descriptor exists".
      • Test isolation: registry._REGISTRY.clear() + bootstrap([...]) per fixture.
  • Negative Consequences / Trade-offs:
      • Adding a skill is a two-place edit (subpackage + __init__.py descriptor list). A PR that touches one without the other leaves the skill undiscoverable; integration tests that assert every public skill class has a descriptor catch this.
  • Risks & Mitigations:
      • Risk: A skill class is published but its descriptor is missing from __init__.py. Mitigation: a registry-completeness test enumerates mirai_shared_skills.__all__ and asserts every name has a descriptor.
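
The completeness check in the mitigation could be sketched roughly as follows. It assumes each descriptor's import_path ends in the exported class name (for example "package.module.ClassName"), which the ADR does not state. The helper name missing_descriptors, the module path, and the ExecutionDebuggingSkill class name are hypothetical.

```python
from collections.abc import Iterable


def missing_descriptors(
    exported_names: Iterable[str], import_paths: Iterable[str]
) -> list[str]:
    """Return exported class names that no descriptor's import_path points at."""
    # Assumption: import_path looks like "package.module.ClassName".
    described = {path.rsplit(".", 1)[-1] for path in import_paths}
    return sorted(name for name in exported_names if name not in described)


# Hypothetical data standing in for mirai_shared_skills.__all__ and for the
# import_path of each registered descriptor.
exported = ["SkillDiscoverySkill", "ExecutionDebuggingSkill"]
paths = ["mirai_shared_skills.discovery.skill.SkillDiscoverySkill"]
missing = missing_descriptors(exported, paths)
```

A test built on this helper would assert that the returned list is empty, so a failure message names exactly the classes that still lack a descriptor.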

7. Implementation Plan & Status Updates

  • Target Milestone/Release: v0.1.0 (current).
  • Implementation Notes:
      • 2026-05-06: ADR formalizes the pattern already implemented in _registry.py and __init__.py. No code changes.