Testing Skills¶
This guide covers how to write tests for skills in mirai-shared-skills. The repo enforces three test tiers — unit, integration, and Docker-required — each with its own fixtures and pytest markers.
Test layout¶
tests/
├── conftest.py                      # registry-clearing fixture, shared fakes
├── auth_gates/
│   ├── test_skill.py                # unit
│   └── test_credential_handoff.py
├── agentic_rag/
│   ├── test_skill.py                # unit (mocked providers)
│   ├── test_token_budget.py         # unit (pure function)
│   ├── providers/
│   │   ├── test_azure_search.py     # unit + respx HTTP mocks
│   │   └── test_neo4j_graph.py      # integration (Docker)
│   └── eval/
│       └── test_harness.py          # eval harness, [eval] extra
├── browser/
│   └── test_skill.py                # unit + respx
├── ...
Tier 1: Unit tests (no live backends)¶
Run with uv sync --extra dev && uv run pytest tests/. These should be the bulk of your tests.
- For HTTP-using skills: use respx to mock httpx calls.
- For provider-using skills (e.g. AgenticRAGSkill): inject MagicMock(spec=GraphProvider) etc., never real providers.
- For pure helpers (e.g. estimate_citation_tokens): plain pytest parametrize.
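Example: a parametrized test for a pure helper. The real signature of estimate_citation_tokens is not shown in this guide, so the sketch below defines a hypothetical stand-in, estimate_tokens_stub, which assumes roughly four characters per token and rounds up. It only illustrates the table-driven shape; in a real test, import the actual helper instead.

```python
import math

import pytest


def estimate_tokens_stub(text: str) -> int:
    """Hypothetical stand-in for estimate_citation_tokens (assumes ~4 chars per token)."""
    return math.ceil(len(text) / 4)


@pytest.mark.parametrize(
    ("text", "expected"),
    [
        ("", 0),           # empty input costs nothing
        ("abcd", 1),       # exactly one 4-character token
        ("abcde", 2),      # a partial token rounds up
        ("a" * 400, 100),  # long input scales linearly
    ],
)
def test_estimate_tokens_stub(text: str, expected: int) -> None:
    assert estimate_tokens_stub(text) == expected
```

Each tuple becomes its own test case in the pytest report, so a failing boundary value is reported by itself without hiding the others.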
Example: testing WeatherSkill with respx:
import respx
from httpx import Response

from mirai_shared_skills.weather import WeatherSkill


# Async tests need an async pytest plugin (e.g. pytest-asyncio); its setup is not shown here.
@respx.mock
async def test_weather_lookup():
    # Stub the upstream endpoint so no real HTTP request leaves the test.
    respx.get("https://api.weather.test/london").mock(
        return_value=Response(200, json={"temp_c": 12.5}),
    )
    skill = WeatherSkill(api_base="https://api.weather.test")
    result = await skill._get_current("london")
    assert result["temp_c"] == 12.5
Example: testing AgenticRAGSkill with mocked providers:
from unittest.mock import AsyncMock, MagicMock

from mirai_shared_skills.agentic_rag import AgenticRAGSkill
from mirai_shared_skills.agentic_rag.providers import GraphProvider, VectorSearchProvider


async def test_rag_token_budget_drops_low_rank_chunks():
    graph = MagicMock(spec=GraphProvider)
    graph.expand = AsyncMock(return_value=[...])  # placeholder: big chunks
    vector = MagicMock(spec=VectorSearchProvider)
    vector.search = AsyncMock(return_value=[...])  # placeholder: small chunks
    skill = AgenticRAGSkill(graph=graph, vector=vector, token_budget=500)
    result = await skill._retrieve("query")
    # The combined chunks exceed the budget, so some must be dropped.
    assert sum(c.estimated_tokens for c in result.chunks) <= 500
    assert result.dropped_count > 0
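The reason for passing spec= in the mocks above can be shown with the standard library alone. The sketch below uses a hypothetical GraphProviderStub class in place of the real GraphProvider, whose actual interface may differ. A spec'd mock rejects attribute names that do not exist on the class, so a typo in a test fails loudly instead of silently returning another mock.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class GraphProviderStub:
    """Hypothetical stand-in for GraphProvider; the real interface may differ."""

    async def expand(self, query: str) -> list[str]:
        raise NotImplementedError


def make_graph_mock() -> MagicMock:
    # spec= restricts the mock to attributes that exist on the class.
    graph = MagicMock(spec=GraphProviderStub)
    graph.expand = AsyncMock(return_value=["chunk-a", "chunk-b"])
    return graph


def misspelled_attribute_is_rejected(graph: MagicMock) -> bool:
    """Return True if a name missing from the spec raises AttributeError."""
    try:
        graph.expnad  # deliberate typo
    except AttributeError:
        return True
    return False


async def demo() -> list[str]:
    graph = make_graph_mock()
    chunks = await graph.expand("query")
    # AsyncMock records awaits, so the call can be verified afterwards.
    graph.expand.assert_awaited_once_with("query")
    return chunks
```

Without spec=, graph.expnad would return a fresh MagicMock and the test would keep running against an attribute the provider never had.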
Tier 2: Integration tests (@pytest.mark.integration)¶
Run with uv sync --all-extras && uv run pytest -m integration. These tests exercise real provider SDKs against in-memory or local backends.
- Mark with @pytest.mark.integration so default pytest tests/ skips them.
- Don't hit production endpoints. Use local backends or recorded fixtures.
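How the repo makes a plain pytest tests/ run skip integration tests is not shown in this guide. One common way, sketched here as an assumption and not as the repo's actual setup, is a pair of hooks in conftest.py: one registers the marker, the other skips marked tests whenever no -m expression was passed.

```python
import pytest


def pytest_configure(config):
    # Register the marker so --strict-markers does not reject it.
    config.addinivalue_line(
        "markers", "integration: needs a real provider SDK or a local backend"
    )


def pytest_collection_modifyitems(config, items):
    # If the caller passed any -m expression (e.g. -m integration),
    # leave test selection to pytest itself.
    if config.getoption("markexpr", default=""):
        return
    skip = pytest.mark.skip(reason="integration test; run with -m integration")
    for item in items:
        if item.get_closest_marker("integration") is not None:
            item.add_marker(skip)
```

With hooks like these, marked tests appear as skipped in a default run, which keeps them visible in the report.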
Tier 3: Docker-required tests¶
Some integration tests need a running Neo4j or other DB. Use docker-compose.test.yml:
docker compose -f docker-compose.test.yml up -d
uv run pytest tests/agentic_rag/providers/test_neo4j_graph.py -m integration
docker compose -f docker-compose.test.yml down -v
CI always runs Tier 1; Tiers 2 and 3 run nightly or on a manual workflow_dispatch.
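For the Docker-backed tests, a guard fixture can turn a missing container into a skip, where the test would otherwise fail with a connection error. The sketch below is an illustration only: the names port_is_open and neo4j_available are hypothetical, and it assumes the compose file publishes Neo4j's default Bolt port (7687) on localhost.

```python
import socket

import pytest


def port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@pytest.fixture(scope="session")
def neo4j_available() -> None:
    # Assumption: docker-compose.test.yml exposes Bolt on 127.0.0.1:7687.
    if not port_is_open("127.0.0.1", 7687):
        pytest.skip("Neo4j is not reachable; start docker-compose.test.yml first")
```

A test that requests neo4j_available as an argument is skipped when the container is down. The check only confirms that the port accepts connections, not that the database has finished starting.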
Registry fixtures¶
Skills register descriptors at import time (see ADR-0004). For tests that touch the registry, clear it between cases:
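import pytest

from mirai_shared_skills import _registry


@pytest.fixture
def empty_registry():
    # Snapshot the import-time registrations, then start from a clean slate.
    saved = dict(_registry._REGISTRY)
    _registry._REGISTRY.clear()
    yield _registry
    # Restore the original registrations so later tests see them again.
    _registry._REGISTRY.clear()
    _registry._REGISTRY.update(saved)


def test_register_and_find(empty_registry):
    # make_fake_descriptor is assumed to be one of the shared fakes from conftest.py.
    descriptor = make_fake_descriptor("fake-skill")
    empty_registry.register(descriptor)
    assert empty_registry.find("fake")[0].name == "fake-skill"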
Linting + type checking¶
CI also runs:
Mypy is in --strict mode for the package. Tests are checked too — keep them typed.
Docs build as a test¶
The CI pipeline runs uv run mkdocs build --strict to catch doc breakage. If your skill ships new references or new ADR cross-links, run this locally:
uv run mkdocs build --strict
In strict mode, broken internal links, missing files, and --8<-- snippets that don't resolve all abort the build with the message: Aborted with N warnings in strict mode!
Related¶
- ADR-0002: Pluggable Provider Pattern — the abstraction that makes mocking trivial.
- ADR-0004: In-Process Skill Descriptor Registry — registry fixture rationale.
- Provider Configuration guide — the env vars integration tests need.