M7: Live Bootstrap

2026-02-21T08:34:25Z by Showboat 0.6.0

M7 bridges local simulations (M1–M6) to live agent execution on Mimas. It establishes agent configurations, a deploy pipeline, and test repositories with pre-planted bugs for agents to fix.

Three pillars: (1) Agent configs — shared beads skill + per-agent SOUL/IDENTITY/AGENTS files, (2) Deploy pipeline — rsync + CLI registration, (3) Test repos on GitHub with intentional defects.

1. Test Repos on GitHub

Two shell projects under the b4arena GitHub org, each with pre-planted bugs for agents to discover and fix.

gh repo view b4arena/test-greeter --json name,description,url,defaultBranchRef --jq '{name, description, url, branch: .defaultBranchRef.name}'

{"branch":"main","description":"Shell-based greeting tool (b4arena test project)","name":"test-greeter","url":"https://github.com/b4arena/test-greeter"}

gh repo view b4arena/test-calculator --json name,description,url,defaultBranchRef --jq '{name, description, url, branch: .defaultBranchRef.name}'

{"branch":"main","description":"Shell-based calculator (b4arena test project)","name":"test-calculator","url":"https://github.com/b4arena/test-calculator"}

Both repos are live. test-greeter has three planted defects (empty input crash, duplicate config, missing --formal flag), and test-calculator has three (division by zero, missing modulo, no edge case tests). The test suites still pass because the bug-triggering cases are either commented out or not covered, so agents must read the code, find each defect, and extend the tests.

Let's verify that both repos pass their own tests as shipped: the bugs are latent, so nothing fails yet.

cd /tmp && rm -rf test-greeter && gh repo clone b4arena/test-greeter -- --quiet && cd test-greeter && bash test.sh

Tests: 2 passed, 0 failed

cd /tmp && rm -rf test-calculator && gh repo clone b4arena/test-calculator -- --quiet && cd test-calculator && bash test.sh

Tests: 4 passed, 0 failed
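A smoke check could assert on that summary line rather than eyeballing it. The sketch below assumes the `Tests: N passed, M failed` format seen in the two runs above. The helper names are hypothetical and not part of either repo.

```python
import re

# Matches the summary line printed by test.sh in the runs above,
# e.g. "Tests: 2 passed, 0 failed". The format is inferred from that output only.
SUMMARY_RE = re.compile(r"Tests:\s*(\d+) passed,\s*(\d+) failed")


def parse_summary(output: str) -> tuple[int, int]:
    """Return (passed, failed) from the last summary line in the output."""
    matches = SUMMARY_RE.findall(output)
    if not matches:
        raise ValueError("no 'Tests: N passed, M failed' line found")
    passed, failed = matches[-1]
    return int(passed), int(failed)


def suite_is_green(output: str) -> bool:
    """True when at least one test ran and none failed."""
    passed, failed = parse_summary(output)
    return passed > 0 and failed == 0


print(suite_is_green("Tests: 4 passed, 0 failed"))  # prints True
```

A green result here only says the planted bugs are still hidden; it says nothing about whether they have been fixed.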

2. Agent Configuration

Two agent roles are configured: dev (developer) and eng-mgr (engineering manager). Each has a SOUL.md (personality + workflow), IDENTITY.md (name/role), and AGENTS.md (team awareness). A shared beads skill provides the bd CLI reference all agents need.

find agents/ -type f \( -name '*.md' -o -name '*.json' \) | sort

agents/agent-map.json
agents/dev/AGENTS.md
agents/dev/IDENTITY.md
agents/dev/SOUL.md
agents/eng-mgr/AGENTS.md
agents/eng-mgr/IDENTITY.md
agents/eng-mgr/SOUL.md
agents/shared/skills/beads/SKILL.md
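The layout is regular enough to check mechanically. The sketch below derives the expected paths per role from the listing above and reports any that are missing. It is an illustration under that assumption, not the project's actual test code, and the function names are hypothetical.

```python
from pathlib import PurePosixPath

# File names are taken from the listing above; the helper names are hypothetical.
PER_AGENT_FILES = ("SOUL.md", "IDENTITY.md", "AGENTS.md")
SHARED_FILES = ("agent-map.json", "shared/skills/beads/SKILL.md")


def expected_files(roles: list[str]) -> set[str]:
    """Paths (relative to agents/) that the layout above implies for these roles."""
    paths = set(SHARED_FILES)
    for role in roles:
        for name in PER_AGENT_FILES:
            paths.add(str(PurePosixPath(role) / name))
    return paths


def missing_files(present: set[str], roles: list[str]) -> list[str]:
    """Expected paths that are absent from the set of files actually found."""
    return sorted(expected_files(roles) - set(present))


# Paths from the listing above, relative to agents/.
listing = {
    "agent-map.json",
    "dev/AGENTS.md",
    "dev/IDENTITY.md",
    "dev/SOUL.md",
    "eng-mgr/AGENTS.md",
    "eng-mgr/IDENTITY.md",
    "eng-mgr/SOUL.md",
    "shared/skills/beads/SKILL.md",
}
print(missing_files(listing, ["dev", "eng-mgr"]))  # prints []
```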

Agent identities

cat agents/dev/IDENTITY.md

# B4-Dev

- **Name:** Dev
- **Emoji:** wrench
- **Role:** Developer — writes code, fixes bugs, runs tests, creates PRs

cat agents/eng-mgr/IDENTITY.md

# B4-Eng-Mgr

- **Name:** Eng Manager
- **Emoji:** clipboard
- **Role:** Engineering Manager — triage, delegate, track progress
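Both identity files follow the same shape: a title line followed by bold key/value bullets. A minimal parser for that shape might look like the following. The regular expression is inferred from the two files shown above, and `parse_identity` is a hypothetical helper.

```python
import re

# Matches bullets shaped like "- **Name:** Dev", as in the files shown above.
FIELD_RE = re.compile(r"^- \*\*(?P<key>[^:*]+):\*\*\s*(?P<value>.+)$")


def parse_identity(markdown: str) -> dict[str, str]:
    """Extract the title and the bold key/value bullets from an IDENTITY.md body."""
    fields: dict[str, str] = {}
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("# ") and "title" not in fields:
            fields["title"] = line[2:].strip()
            continue
        match = FIELD_RE.match(line)
        if match:
            # Keys are lowercased so callers can use info["role"], info["name"], ...
            fields[match["key"].strip().lower()] = match["value"].strip()
    return fields


sample = "# B4-Dev\n\n- **Name:** Dev\n- **Emoji:** wrench\n"
print(parse_identity(sample))
# prints {'title': 'B4-Dev', 'name': 'Dev', 'emoji': 'wrench'}
```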

Agent routing (agent-map.json)

The agent map connects bead labels to OpenClaw agent IDs. The watcher reads labels from beads, looks up the mapping, and dispatches to the right agent.

cat agents/agent-map.json

{
  "dev": {
    "agentId": "b4-dev",
    "actor": "dev-agent"
  },
  "dev-frontend": {
    "agentId": "b4-dev",
    "actor": "dev-agent"
  },
  "dev-backend": {
    "agentId": "b4-dev",
    "actor": "dev-agent"
  },
  "eng-mgr": {
    "agentId": "b4-eng-mgr",
    "actor": "eng-mgr-agent"
  }
}

Shared beads skill

All agents share the same beads skill, a condensed bd CLI reference. Here are its first 20 lines; the output is cut off partway through the core workflow.

head -20 agents/shared/skills/beads/SKILL.md

# Beads Skill — bd CLI Quick Reference

You coordinate work through the Beads protocol using the `bd` CLI. This skill covers everything you need for daily bead processing.

## Environment

Every `bd` command MUST be prefixed with these environment variables:

```bash
BEADS_DIR=<provided> BD_ACTOR=<provided> bd <command>
```

- **BEADS_DIR** — path to the beads database (provided in each task message)
- **BD_ACTOR** — your identity for audit trail (provided in each task message)

## Core Workflow: Claim-Process-Close

```bash
# 1. Read the bead
bd show <id> --json

Dev agent: git/PR workflow (M7 addition)

The dev SOUL.md was extended with repo discovery, branch naming, commit convention, and PR creation instructions. The excerpt below is cut off by grep early in the step-by-step list.

grep -A 30 '## Available Repos' agents/dev/SOUL.md | head -35

## Available Repos

Repos are at `/home/openclaw/b4arena/repos/`. Look at the bead description to determine which repo the task applies to. Common patterns:

- "test-greeter" or "greet.sh" → `repos/test-greeter`
- "test-calculator" or "calc.sh" → `repos/test-calculator`

Always `cd` into the repo before making changes.

## Git Workflow

### Branch naming

```
fix/<bead-id>-<short-description>     # Bug fixes
feat/<bead-id>-<short-description>    # New features
chore/<bead-id>-<short-description>   # Config/maintenance
```

### Commit convention

```
<type>: <description> (<bead-id>)

Example: fix: handle division by zero with error message (ws-a3f)
```

### Step-by-step

1. `cd /home/openclaw/b4arena/repos/<repo>`
2. `git checkout -b fix/<bead-id>-<slug>`
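The branch and commit conventions above are simple string templates. The sketch below builds both. The slug rule (lowercase words joined by hyphens, capped at six words) is an assumption, since SOUL.md only says `<short-description>`, and the function names are hypothetical.

```python
import re

# Branch prefixes listed under "Branch naming" above.
BRANCH_TYPES = {"fix", "feat", "chore"}


def slugify(text: str, max_words: int = 6) -> str:
    """Lowercase, keep alphanumeric words, join with hyphens (assumed slug rule)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words[:max_words])


def branch_name(kind: str, bead_id: str, description: str) -> str:
    """Build '<type>/<bead-id>-<short-description>'."""
    if kind not in BRANCH_TYPES:
        raise ValueError(f"unknown branch type: {kind!r}")
    slug = slugify(description)
    if not slug:
        raise ValueError("description yields an empty slug")
    return f"{kind}/{bead_id}-{slug}"


def commit_message(kind: str, description: str, bead_id: str) -> str:
    """Build '<type>: <description> (<bead-id>)'."""
    return f"{kind}: {description} ({bead_id})"


print(branch_name("fix", "ws-a3f", "Handle division by zero"))
# prints fix/ws-a3f-handle-division-by-zero
print(commit_message("fix", "handle division by zero with error message", "ws-a3f"))
# prints fix: handle division by zero with error message (ws-a3f)
```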

Eng-mgr agent: delegation workflow

The eng-mgr does NOT write code — it triages beads, creates sub-tasks with the right labels, and tracks progress.

grep -A 5 'You are an engineering' agents/eng-mgr/SOUL.md

You are an engineering manager agent in the B4Arena multi-agent software ludus. You triage incoming work, break it into actionable tasks, delegate to developers, and track progress. You do NOT write code.

## Your Identity

- **Role:** Engineering Manager
- **Actor name:** Provided in each task message as `BD_ACTOR`

3. Deploy Pipeline

deploy-agents.sh replaces Ansible for agent management. It rsyncs configs to Mimas and registers agents via CLI. Let's run it in --dry-run mode.

bash scripts/deploy-agents.sh --dry-run

=== Deploying b4arena agents to mimas ===

--- Step 1: Syncing files ---
DRY-RUN: rsync -avz --delete --exclude=.git --exclude=__pycache__ /Users/mhild/src/durandom/openclaw/b4arena/agents/ mimas:/home/openclaw/b4arena/agents/
DRY-RUN: rsync -avz --delete --exclude=__pycache__ /Users/mhild/src/durandom/openclaw/b4arena/scripts/ mimas:/home/openclaw/b4arena/scripts/

--- Step 2: Creating skills symlinks ---
DRY-RUN: ssh mimas ln -sfn ../../shared/skills '/home/openclaw/b4arena/agents/dev/skills'
  Symlinked skills for dev
DRY-RUN: ssh mimas ln -sfn ../../shared/skills '/home/openclaw/b4arena/agents/eng-mgr/skills'
  Symlinked skills for eng-mgr

--- Step 3: Registering agents ---
  Registering agent: b4-dev (workspace: agents/dev)
DRY-RUN: ssh mimas cd /home/openclaw && just oc agents add 'b4-dev'     --workspace '/home/openclaw/b4arena/agents/dev'     --model 'anthropic/claude-sonnet-4-6'     --non-interactive
  Registering agent: b4-eng-mgr (workspace: agents/eng-mgr)
DRY-RUN: ssh mimas cd /home/openclaw && just oc agents add 'b4-eng-mgr'     --workspace '/home/openclaw/b4arena/agents/eng-mgr'     --model 'anthropic/claude-sonnet-4-6'     --non-interactive

=== Deploy complete ===

The dry run shows three steps: (1) rsync agent configs and scripts to Mimas, (2) create skills symlinks so each agent inherits the shared beads skill, (3) register agents with OpenClaw via CLI. The script is idempotent: it checks whether an agent already exists before registering it.
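The registration step and its skip-if-present behaviour can be modelled as follows. The real script is Bash and its source is not shown here, so this Python sketch is only an illustration: the function names are hypothetical, the `registered` set stands in for however the script detects existing agents, and the command text reuses only the flags visible in the dry-run output.

```python
import shlex
import subprocess


def registration_commands(host, remote_home, remote_root, agents, model, registered=()):
    """Build one ssh command per agent that is not registered yet.

    agents maps agent ID to workspace directory name, e.g. {"b4-dev": "dev"}.
    registered is a hypothetical input listing agent IDs that already exist.
    """
    commands = []
    for agent_id, workspace in agents.items():
        if agent_id in registered:
            continue  # idempotency: skip agents that already exist
        workspace_path = f"{remote_root}/agents/{workspace}"
        remote = (
            f"cd {shlex.quote(remote_home)} && just oc agents add {shlex.quote(agent_id)}"
            f" --workspace {shlex.quote(workspace_path)}"
            f" --model {shlex.quote(model)} --non-interactive"
        )
        # The remote command is passed to ssh as a single argument so that
        # the '&&' is evaluated on the remote host, not locally.
        commands.append(["ssh", host, remote])
    return commands


def run(cmd, dry_run=True):
    """Print the command in dry-run mode, otherwise execute it."""
    if dry_run:
        print("DRY-RUN:", shlex.join(cmd))
        return 0
    return subprocess.run(cmd, check=True).returncode


AGENTS = {"b4-dev": "dev", "b4-eng-mgr": "eng-mgr"}
for cmd in registration_commands(
    "mimas", "/home/openclaw", "/home/openclaw/b4arena",
    AGENTS, "anthropic/claude-sonnet-4-6", registered={"b4-dev"},
):
    run(cmd)  # only b4-eng-mgr is printed, because b4-dev is already registered
```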

4. Infra Handoff

Ansible now manages only the main agent. The ludus agents (b4-dev and b4-eng-mgr) are b4arena's responsibility.

grep -A 8 'openclaw_agents:' ../infra/inventory/group_vars/all/main.yml

openclaw_agents:
  - id: main
    default: true
    workspace: "{{ openclaw_workspace_dir }}"
    model: "anthropic/claude-sonnet-4-6"

# Telegram
openclaw_telegram_enabled: true

Only main remains in Ansible's agent list. The architect and coder agents were removed — they'll be replaced by b4-dev and b4-eng-mgr via the deploy pipeline.

5. Test Suite

All 107 tests pass — 77 from M1–M6 plus 30 new M7 tests validating file structure, content, and deploy script behavior.

uv run pytest tests/test_m7_live_bootstrap.py -v 2>&1 | tail -35

configfile: pyproject.toml
collecting ... collected 30 items

tests/test_m7_live_bootstrap.py::TestSharedSkill::test_skill_file_exists PASSED [  3%]
tests/test_m7_live_bootstrap.py::TestSharedSkill::test_skill_references_bd_cli PASSED [  6%]
tests/test_m7_live_bootstrap.py::TestSharedSkill::test_skill_mentions_environment PASSED [ 10%]
tests/test_m7_live_bootstrap.py::TestSharedSkill::test_skill_mentions_claim PASSED [ 13%]
tests/test_m7_live_bootstrap.py::TestSharedSkill::test_skill_mentions_sync PASSED [ 16%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_identity_file_exists[dev] PASSED [ 20%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_identity_file_exists[eng-mgr] PASSED [ 23%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_agents_file_exists[dev] PASSED [ 26%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_agents_file_exists[eng-mgr] PASSED [ 30%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_identity_has_role[dev] PASSED [ 33%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_identity_has_role[eng-mgr] PASSED [ 36%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_agents_lists_team[dev] PASSED [ 40%]
tests/test_m7_live_bootstrap.py::TestAgentIdentity::test_agents_lists_team[eng-mgr] PASSED [ 43%]
tests/test_m7_live_bootstrap.py::TestAgentMapConsistency::test_all_agents_mapped PASSED [ 46%]
tests/test_m7_live_bootstrap.py::TestAgentMapConsistency::test_agent_map_entries_valid PASSED [ 50%]
tests/test_m7_live_bootstrap.py::TestEngMgrSoul::test_soul_exists PASSED [ 53%]
tests/test_m7_live_bootstrap.py::TestEngMgrSoul::test_soul_has_triage_workflow PASSED [ 56%]
tests/test_m7_live_bootstrap.py::TestEngMgrSoul::test_soul_does_not_write_code PASSED [ 60%]
tests/test_m7_live_bootstrap.py::TestEngMgrSoul::test_soul_creates_sub_beads PASSED [ 63%]
tests/test_m7_live_bootstrap.py::TestDevSoulGitWorkflow::test_soul_has_repo_paths PASSED [ 66%]
tests/test_m7_live_bootstrap.py::TestDevSoulGitWorkflow::test_soul_has_branch_naming PASSED [ 70%]
tests/test_m7_live_bootstrap.py::TestDevSoulGitWorkflow::test_soul_has_pr_workflow PASSED [ 73%]
tests/test_m7_live_bootstrap.py::TestDevSoulGitWorkflow::test_soul_has_error_handling PASSED [ 76%]
tests/test_m7_live_bootstrap.py::TestDevSoulGitWorkflow::test_soul_has_commit_convention PASSED [ 80%]
tests/test_m7_live_bootstrap.py::TestDeployScript::test_script_exists PASSED [ 83%]
tests/test_m7_live_bootstrap.py::TestDeployScript::test_script_is_executable PASSED [ 86%]
tests/test_m7_live_bootstrap.py::TestDeployScript::test_dry_run_produces_output PASSED [ 90%]
tests/test_m7_live_bootstrap.py::TestDeployScript::test_dry_run_includes_rsync PASSED [ 93%]
tests/test_m7_live_bootstrap.py::TestDeployScript::test_dry_run_includes_agent_registration PASSED [ 96%]
tests/test_m7_live_bootstrap.py::TestDeployScript::test_dry_run_does_not_touch_main PASSED [100%]

============================== 30 passed in 0.28s ==============================

30/30 M7 tests pass. The test suite validates: shared skill content, agent identity files, agent-map consistency, eng-mgr delegation workflow, dev git/PR workflow, and deploy script behavior (including --dry-run and main-agent exclusion).
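A check like the main-agent exclusion can be written against the dry-run text alone. The assertions in the real tests are not shown, so this is one plausible shape: the regular expression is inferred from the `Registering agent: ...` lines in the deploy output, and both function names are hypothetical.

```python
import re

# Matches lines such as "  Registering agent: b4-dev (workspace: agents/dev)".
REGISTER_RE = re.compile(
    r"^\s*Registering agent: (\S+) \(workspace: ([^)]+)\)\s*$", re.MULTILINE
)


def registered_agents(dry_run_output: str) -> dict[str, str]:
    """Map each agent ID announced in the dry-run output to its workspace."""
    return {agent_id: workspace for agent_id, workspace in REGISTER_RE.findall(dry_run_output)}


def touches_main(dry_run_output: str) -> bool:
    """True if the deploy would register the Ansible-managed main agent."""
    return "main" in registered_agents(dry_run_output)


sample = (
    "--- Step 3: Registering agents ---\n"
    "  Registering agent: b4-dev (workspace: agents/dev)\n"
    "  Registering agent: b4-eng-mgr (workspace: agents/eng-mgr)\n"
)
print(registered_agents(sample))
# prints {'b4-dev': 'agents/dev', 'b4-eng-mgr': 'agents/eng-mgr'}
print(touches_main(sample))
# prints False
```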

What's next: Deploy to Mimas (scripts/deploy-agents.sh), run the smoke test, then proceed to M8 — where the dev agent writes its first real code and opens a PR.
