M6: OpenClaw Integration

2026-02-20T23:37:07Z by Showboat 0.6.0

What we are testing

M6 replaces the shell agent simulations (agent-sim.sh) with real OpenClaw LLM agents. The key new component is agent-openclaw.sh, a dispatcher that reads WAKE lines from stdin, looks up each label in agents/agent-map.json, and triggers the mapped OpenClaw agent via openclaw agent --message.

Since we don't have a live OpenClaw gateway in this demo, we focus on --dry-run mode (proving correct CLI construction) and test results (proving the mock-based integration works).

Agent map configuration

The label-to-agent mapping lives in agents/agent-map.json. Let's inspect it.

cat agents/agent-map.json

{
  "dev": {
    "agentId": "b4-dev",
    "actor": "dev-agent"
  },
  "dev-frontend": {
    "agentId": "b4-dev",
    "actor": "dev-agent"
  },
  "dev-backend": {
    "agentId": "b4-dev",
    "actor": "dev-agent"
  },
  "eng-mgr": {
    "agentId": "b4-eng-mgr",
    "actor": "eng-mgr-agent"
  }
}

Four label mappings: dev, dev-frontend, and dev-backend all route to the same b4-dev agent (with actor dev-agent). The eng-mgr label routes to b4-eng-mgr. Adding new roles requires only a JSON entry — zero code changes.
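
The lookup can be sketched in a few lines. The example below is in Python, the language of the project's test suite; the real dispatcher is a shell script whose internals are not shown here. The function name resolve_agent, the inline AGENT_MAP_JSON constant, and the unmapped label "qa" are illustrative assumptions.

```python
import json

# Inline copy of agents/agent-map.json as printed above (assumption: the real
# dispatcher reads the file from disk rather than embedding it).
AGENT_MAP_JSON = """
{
  "dev": {"agentId": "b4-dev", "actor": "dev-agent"},
  "dev-frontend": {"agentId": "b4-dev", "actor": "dev-agent"},
  "dev-backend": {"agentId": "b4-dev", "actor": "dev-agent"},
  "eng-mgr": {"agentId": "b4-eng-mgr", "actor": "eng-mgr-agent"}
}
"""


def resolve_agent(label, agent_map):
    """Return (agentId, actor) for a label, or None when the label is unmapped."""
    entry = agent_map.get(label)
    if entry is None:
        return None
    return entry["agentId"], entry["actor"]


agent_map = json.loads(AGENT_MAP_JSON)
print(resolve_agent("dev-frontend", agent_map))  # mapped label
print(resolve_agent("qa", agent_map))            # hypothetical unmapped label
```

Because routing is a plain dictionary lookup, a new role is just one more key in the JSON object.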

Creating test beads

Let's create beads with different labels to test the dispatcher routing.

export BEADS_DIR=$(pwd)/beads/.beads && bd create "Implement login validation" --labels dev -d "Add email format check to login form" --json

warning: beads.role not configured. Run 'bd init' to set.
{
  "id": "beads-4z0",
  "title": "Implement login validation",
  "description": "Add email format check to login form",
  "status": "open",
  "priority": 2,
  "issue_type": "task",
  "owner": "test@b4arena.dev",
  "created_at": "2026-02-20T23:37:53.775045Z",
  "created_by": "b4arena",
  "updated_at": "2026-02-20T23:37:53.775045Z"
}

export BEADS_DIR=$(pwd)/beads/.beads && bd create "Triage: unclear requirements" -d "No labels, should route to eng-mgr" --json

warning: beads.role not configured. Run 'bd init' to set.
{
  "id": "beads-kz0",
  "title": "Triage: unclear requirements",
  "description": "No labels, should route to eng-mgr",
  "status": "open",
  "priority": 2,
  "issue_type": "task",
  "owner": "test@b4arena.dev",
  "created_at": "2026-02-20T23:37:59.496942Z",
  "created_by": "b4arena",
  "updated_at": "2026-02-20T23:37:59.496942Z"
}

Watcher output

The watcher (unchanged from M5) routes beads by label.

export BEADS_DIR=$(pwd)/beads/.beads && bash scripts/beads-watcher.sh

WAKE backend-dev beads-53o
WAKE dev beads-4z0
WAKE frontend-qa beads-ml6 beads-mx2
WAKE eng-mgr beads-kz0 beads-qfd

Our new bead beads-4z0 appears as WAKE dev. The unlabeled bead beads-kz0 routes to eng-mgr. Some older beads from previous milestones are also visible.
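
Judging from the output above, each WAKE line has the form WAKE <label> <bead-id> [<bead-id> ...]. A minimal Python sketch of parsing that format follows. The function name parse_wake_line is hypothetical, and treating a WAKE line without bead IDs as unparseable is an assumption rather than documented watcher behavior.

```python
def parse_wake_line(line):
    """Split 'WAKE <label> <bead-id>...' into (label, [bead ids]).

    Returns None for anything that is not a WAKE line with at least one bead.
    """
    parts = line.split()
    if len(parts) < 3 or parts[0] != "WAKE":
        return None
    return parts[1], parts[2:]


# The watcher output captured above.
watcher_output = """\
WAKE backend-dev beads-53o
WAKE dev beads-4z0
WAKE frontend-qa beads-ml6 beads-mx2
WAKE eng-mgr beads-kz0 beads-qfd
"""

for line in watcher_output.splitlines():
    print(parse_wake_line(line))
```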

Dispatcher dry-run

Now the key M6 component: agent-openclaw.sh in --dry-run mode. This shows what openclaw CLI commands would be executed without actually calling the LLM.

export BEADS_DIR=$(pwd)/beads/.beads && bash scripts/beads-watcher.sh | bash scripts/agent-openclaw.sh --label dev --dry-run

DRY-RUN: openclaw agent --agent b4-dev --message <msg> --json --timeout 120
  bead_id=beads-4z0 actor=dev-agent

The dispatcher correctly:

Filtered WAKE lines to only dev (ignoring backend-dev, frontend-qa, eng-mgr)
Mapped label dev → agent b4-dev with actor dev-agent
Built the command openclaw agent --agent b4-dev --message <msg> --json --timeout 120 without executing it
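
The command construction shown in the dry-run output can be sketched in Python as follows. The flags are the ones printed above. The helper names build_message and build_command, the message wording, and the placeholder path are assumptions: the tests only establish that the message contains the bead ID, BEADS_DIR, and the actor.

```python
import shlex


def build_message(bead_id, actor, beads_dir):
    # Hypothetical wording. Only the presence of these three values is
    # implied by the test names later in this document.
    return (
        f"Work on bead {bead_id}.\n"
        f"BEADS_DIR={beads_dir}\n"
        f"BD_ACTOR={actor}"
    )


def build_command(agent_id, message, timeout=120):
    """Build the argv list for one agent turn, using the flags from the dry-run output."""
    return [
        "openclaw", "agent",
        "--agent", agent_id,
        "--message", message,
        "--json",
        "--timeout", str(timeout),
    ]


# "/path/to/beads/.beads" is a placeholder, not a real location.
msg = build_message("beads-4z0", "dev-agent", "/path/to/beads/.beads")
argv = build_command("b4-dev", msg)
print(shlex.join(argv))
```

Keeping the command as an argument list rather than a single string means a multiline message needs no extra shell quoting when it is eventually passed to a process.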

Let's also try without a label filter — all mapped labels get dispatched.

export BEADS_DIR=$(pwd)/beads/.beads && bash scripts/beads-watcher.sh | bash scripts/agent-openclaw.sh --dry-run 2>&1

Warning: no agent mapping for label 'backend-dev', skipping
DRY-RUN: openclaw agent --agent b4-dev --message <msg> --json --timeout 120
  bead_id=beads-4z0 actor=dev-agent
Warning: no agent mapping for label 'frontend-qa', skipping
DRY-RUN: openclaw agent --agent b4-eng-mgr --message <msg> --json --timeout 120
  bead_id=beads-kz0 actor=eng-mgr-agent
DRY-RUN: openclaw agent --agent b4-eng-mgr --message <msg> --json --timeout 120
  bead_id=beads-qfd actor=eng-mgr-agent

Without a filter, the dispatcher processes all WAKE lines:

dev → dispatches to b4-dev (1 bead)
eng-mgr → dispatches to b4-eng-mgr (2 beads, serially)
backend-dev and frontend-qa → warning: no mapping (skipped gracefully)

Labels without a mapping in agent-map.json are safely skipped with a warning. Beads are processed serially because Dolt (bd's backend) is not safe for concurrent access.
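
The skip-and-continue behavior and the one-bead-at-a-time ordering can be modeled with a small planning function. This is a Python sketch of the behavior observed above, not the script's actual code. The name plan_dispatch is hypothetical, and AGENT_MAP is a trimmed copy of the map containing only the labels needed here.

```python
# Trimmed copy of agents/agent-map.json (assumption: only the entries needed here).
AGENT_MAP = {
    "dev": {"agentId": "b4-dev", "actor": "dev-agent"},
    "eng-mgr": {"agentId": "b4-eng-mgr", "actor": "eng-mgr-agent"},
}


def plan_dispatch(wake_lines, agent_map, label_filter=None):
    """Return (dispatches, skipped_labels) in the order the lines arrive."""
    dispatches, skipped = [], []
    for line in wake_lines:
        parts = line.split()
        if len(parts) < 3 or parts[0] != "WAKE":
            continue  # ignore anything that is not a WAKE line
        label, bead_ids = parts[1], parts[2:]
        if label_filter is not None and label != label_filter:
            continue  # a label filter drops every other label silently
        entry = agent_map.get(label)
        if entry is None:
            skipped.append(label)  # unmapped label: warn and move on
            continue
        # One entry per bead, so each bead gets its own agent turn, in order.
        for bead_id in bead_ids:
            dispatches.append((entry["agentId"], entry["actor"], bead_id))
    return dispatches, skipped


wake_lines = [
    "WAKE backend-dev beads-53o",
    "WAKE dev beads-4z0",
    "WAKE frontend-qa beads-ml6 beads-mx2",
    "WAKE eng-mgr beads-kz0 beads-qfd",
]

dispatches, skipped = plan_dispatch(wake_lines, AGENT_MAP)
for label in skipped:
    print(f"skipped: {label}")
for agent_id, actor, bead_id in dispatches:
    print(f"{agent_id} <- {bead_id} (actor={actor})")
```

Returning a flat, ordered list makes the serial constraint explicit: a caller would run the entries one after another instead of spawning them in parallel.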

Test results

The M6 acceptance tests use a mock openclaw script to verify the dispatcher without requiring a live LLM.

uv run pytest tests/test_m6_openclaw_integration.py -v 2>&1 | tail -25

rootdir: /Users/mhild/src/durandom/openclaw/b4arena
configfile: pyproject.toml
collecting ... collected 19 items

tests/test_m6_openclaw_integration.py::TestDispatcherWakeParsing::test_dispatches_single_bead PASSED [  5%]
tests/test_m6_openclaw_integration.py::TestDispatcherWakeParsing::test_dispatches_multiple_beads_same_label PASSED [ 10%]
tests/test_m6_openclaw_integration.py::TestDispatcherWakeParsing::test_dispatches_multiple_wake_lines PASSED [ 15%]
tests/test_m6_openclaw_integration.py::TestDispatcherWakeParsing::test_ignores_non_wake_lines PASSED [ 21%]
tests/test_m6_openclaw_integration.py::TestDispatcherLabelFilter::test_label_filter_processes_only_matching PASSED [ 26%]
tests/test_m6_openclaw_integration.py::TestDispatcherLabelFilter::test_compound_label_filter PASSED [ 31%]
tests/test_m6_openclaw_integration.py::TestDispatcherAgentMapping::test_missing_label_mapping_warns PASSED [ 36%]
tests/test_m6_openclaw_integration.py::TestDispatcherAgentMapping::test_missing_agent_map_file_fails PASSED [ 42%]
tests/test_m6_openclaw_integration.py::TestDispatcherMessage::test_message_contains_bead_id PASSED [ 47%]
tests/test_m6_openclaw_integration.py::TestDispatcherMessage::test_message_contains_beads_dir PASSED [ 52%]
tests/test_m6_openclaw_integration.py::TestDispatcherMessage::test_message_contains_actor PASSED [ 57%]
tests/test_m6_openclaw_integration.py::TestDispatcherDryRun::test_dry_run_does_not_call_openclaw PASSED [ 63%]
tests/test_m6_openclaw_integration.py::TestDispatcherDryRun::test_dry_run_shows_agent_id PASSED [ 68%]
tests/test_m6_openclaw_integration.py::TestDispatcherErrorHandling::test_requires_beads_dir PASSED [ 73%]
tests/test_m6_openclaw_integration.py::TestDispatcherErrorHandling::test_openclaw_failure_warns_and_continues PASSED [ 78%]
tests/test_m6_openclaw_integration.py::TestDispatcherJsonOutput::test_agent_json_output_on_stdout PASSED [ 84%]
tests/test_m6_openclaw_integration.py::TestAgentMapConfig::test_default_agent_map_is_valid_json PASSED [ 89%]
tests/test_m6_openclaw_integration.py::TestAgentMapConfig::test_default_agent_map_has_required_fields PASSED [ 94%]
tests/test_m6_openclaw_integration.py::TestAgentMapConfig::test_default_agent_map_has_dev_entry PASSED [100%]

============================= 19 passed in 18.92s ==============================

All 19 M6 tests pass. Test coverage:

| Category | Tests | What's verified |
|---|---|---|
| WAKE parsing | 4 | Single/multiple beads, multiple lines, non-WAKE ignored |
| Label filter | 2 | Single label, compound label (sorted+joined) |
| Agent mapping | 2 | Missing mapping warns, missing file errors |
| Message content | 3 | Bead ID, BEADS_DIR, BD_ACTOR in message |
| Dry-run | 2 | No openclaw call, shows agent ID |
| Error handling | 2 | Missing BEADS_DIR, CLI failure continues |
| JSON output | 1 | Agent response passed through |
| Config validation | 3 | Valid JSON, required fields, dev entry exists |
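
One way a mock openclaw can record its invocations so that multiline messages survive is to write explicit separators between arguments and between calls. The Python sketch below illustrates that idea only. The real mock's log format is not shown in this document, so the separator characters, the function names, and the log file name are all assumptions.

```python
import tempfile
from pathlib import Path

# Hypothetical separators, chosen because they do not occur in normal text.
ARG_SEP = "\x1f"     # between the arguments of one invocation
CALL_SEP = "\x1e\n"  # after each invocation


def log_invocation(log_path, argv):
    """What a mock could do: append its arguments instead of calling an LLM."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(ARG_SEP.join(argv) + CALL_SEP)


def read_invocations(log_path):
    """Split the log back into one argument list per recorded call."""
    text = Path(log_path).read_text(encoding="utf-8")
    return [chunk.split(ARG_SEP) for chunk in text.split(CALL_SEP) if chunk]


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "openclaw-mock.log"
    log_invocation(log, ["agent", "--agent", "b4-dev", "--message", "line one\nline two"])
    log_invocation(log, ["agent", "--agent", "b4-eng-mgr", "--message", "triage"])
    calls = read_invocations(log)

print(len(calls))
print(repr(calls[0][-1]))
```

A line-per-call log would split the first message in two. With explicit separators, a test can assert on the exact message each agent would have received.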

Full test suite regression

All M1–M6 tests pass together.

uv run pytest tests/ -v 2>&1 | tail -5

tests/test_m6_openclaw_integration.py::TestAgentMapConfig::test_default_agent_map_is_valid_json PASSED [ 97%]
tests/test_m6_openclaw_integration.py::TestAgentMapConfig::test_default_agent_map_has_required_fields PASSED [ 98%]
tests/test_m6_openclaw_integration.py::TestAgentMapConfig::test_default_agent_map_has_dev_entry PASSED [100%]

======================== 77 passed in 124.27s (0:02:04) ========================

77 tests, all green. No regressions in M1–M5.

Summary

M6 OpenClaw integration is complete:

agents/agent-map.json: Declarative label → agent ID mapping. New roles require only a JSON entry.
agents/dev/SOUL.md: Dev agent personality with bd CLI instructions. The agent reads the bead details, claims the bead, does the work, and closes it with a meaningful reason.
scripts/agent-openclaw.sh: Drop-in replacement for agent-sim.sh. Same stdin interface (WAKE lines), same --label filtering, plus --dry-run, --timeout, and --map flags.
19 mock-based tests: Verify dispatcher logic without spending LLM tokens. Separator-based mock logging handles multiline messages correctly.
Architecture corrected: The docs previously referenced openclaw system event, a command that never existed. They now use openclaw agent --message.

The Tier 1 → Tier 4 boundary is now concrete: beads-watcher.sh (0 tokens) produces WAKE lines, and agent-openclaw.sh bridges them to OpenClaw agents (one full LLM turn per bead).
