Meta-Orchestration Principles
Note: Internal file paths reference the working system these principles govern. They’re included as provenance, not as links to follow.
Purpose: Foundational values that guide design decisions across the system.
When to reference: Before making architectural choices. When evaluating trade-offs. When something feels wrong but you can’t articulate why.
LLM-First Principles
Fundamental constraints when working with LLMs. These apply to any LLM-based system, not just orchestration.
Provenance (Foundational)
Every conclusion must trace to something outside the conversation.
The test: “Can someone who wasn’t here verify this?”
What this means:
- Claims anchor to observable reality (code runs, file exists, test passes)
- The chain of reasoning terminates in something verifiable
- Artifacts preserve evidence, not just conclusions
What this rejects:
- “Claude confirmed this” (closed loop - AI agreeing with itself)
- “This feels significant” (feeling is not evidence)
- “I wrote it down” (writing preserves the claim, not the proof)
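As a hedged sketch of what "anchored to something outside the conversation" can look like in tooling (the `Claim` type, field names, and test command are hypothetical, not part of any existing tool):

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// Claim pairs a conclusion with the observable evidence behind it, so
// someone who wasn't in the conversation can re-run the check.
type Claim struct {
	Statement  string    // the conclusion being recorded
	Command    string    // how the evidence was produced
	Output     string    // what was actually observed
	CapturedAt time.Time // when the evidence was captured
}

// verify re-runs the evidence command: the chain terminates outside the chat.
func verify(c *Claim) error {
	out, err := exec.Command("sh", "-c", c.Command).CombinedOutput()
	c.Output = string(out)
	c.CapturedAt = time.Now()
	return err
}

func main() {
	c := Claim{Statement: "the auth tests pass", Command: "go test ./auth/..."}
	err := verify(&c)
	fmt.Printf("%q verified: %v\n", c.Statement, err == nil)
}
```

The point is the shape: the artifact carries the command and the observed output, so the chain terminates in something anyone can re-run.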
Why this is foundational: Session Amnesia tells you to externalize. Provenance tells you how - anchored to something that exists whether or not you talked about it. Elaborate externalization without provenance is a closed loop.
The failure mode: Closed loops feel like progress. Each step feels validated. But the chain references itself - nothing outside the conversation. This is how you spiral.
How other principles derive from this:
- Evidence Hierarchy: Some sources have better provenance than others
- Gate Over Remind: Gates create verifiable provenance, reminders don’t
- Self-Describing Artifacts: Artifacts carry their own provenance
Origin: Dec 2025, after recognizing that psychosis involved elaborate externalization anchored to closed loops, not reality. The infrastructure being built now (evidence hierarchies, completion gates, artifact requirements) is provenance machinery.
Session Amnesia
Every pattern in this system compensates for Claude having no memory between sessions.
The test: “Will this help the next Claude resume without memory?”
What this means:
- State must externalize to files (workspaces, artifacts, decisions)
- Context must be discoverable (standard locations, descriptive naming)
- Resumption must be explicit (Phase, Status, Next Step)
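As one hedged illustration of the first and last points above (the struct, field names, and file path are hypothetical, not an existing schema), externalized resumption state might look like:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// WorkspaceState is everything the next session needs in order to resume
// without access to this session's conversation history.
type WorkspaceState struct {
	Phase    string `json:"phase"`     // workflow stage, e.g. "implementation"
	Status   string `json:"status"`    // ability to continue, e.g. "blocked"
	NextStep string `json:"next_step"` // the single explicit action to take next
}

func main() {
	state := WorkspaceState{
		Phase:    "implementation",
		Status:   "blocked on failing migration test",
		NextStep: "fix the down-migration, then re-run the suite",
	}
	data, err := json.MarshalIndent(state, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	// Externalized to a file: the conversation can vanish, the state cannot.
	if err := os.WriteFile("STATE.json", data, 0o644); err != nil {
		fmt.Println("write failed:", err)
	}
}
```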
What this rejects:
- “It’s in the conversation history” (next Claude won’t see it)
- “The code is self-documenting” (next Claude won’t remember reading it)
- “I’ll document it later” (context is lost, later never comes)
This is THE constraint. When principles conflict, session amnesia wins.
Self-Describing Artifacts
Generated files and agent-edited artifacts must contain their own operating instructions.
The test: “If an agent encounters this file with no memory of how it was created, can they modify it correctly?”
What this means:
When artifacts are generated, compiled, or derived from other sources—or when agents create durable state—embed five pieces of information that serve as an “operating manual” for the next consumer:
- What is this - Format marker (e.g., `AUTO-GENERATED by skillc`)
- What NOT to do - Protection warning (e.g., `DO NOT EDIT THIS FILE DIRECTLY`)
- Where is source - Source location (e.g., `Source: .skillc/`)
- How to modify - Rebuild command (e.g., `Run: skillc build`)
- When generated - Timestamp (helps detect staleness)
For agent-edited artifacts (investigations, decisions, issues), this means including metadata (Status, Phase, Next Step) that makes the current state and resumption path explicit.
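For a generated Go file, such a header might look like the following sketch (the package name and timestamp are illustrative; the tool names echo the examples above):

```go
// AUTO-GENERATED by skillc. DO NOT EDIT THIS FILE DIRECTLY.
// Source: .skillc/
// To modify: edit the source files, then run: skillc build
// Generated: 2025-12-04T09:15:00Z
//
// The five lines above are the operating manual: what this is, what not
// to do, where the source lives, how to rebuild, and when it was built.
package skills
```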
What this rejects:
- “The build process is documented elsewhere” (agent won’t find it)
- “Just look at the Makefile” (requires memory to connect artifact → build script)
- “Comments are for humans” (for agents, they’re load-bearing infrastructure)
Key distinction: For humans, discovering modification instructions is convenient (can grep, ask teammates, check history). For agents, it’s necessary (no persistent memory means no discovery path).
Why this matters: When consumers have no persistent memory (AI agents), artifacts must carry their own operating manuals. This isn’t documentation; it’s load-bearing infrastructure. Without self-describing headers, agents will edit generated files directly, breaking the build system and creating conflicts.
Progressive Disclosure
TLDR first. Key sections next. Full details available.
Why: Context windows are finite. Attention is limited. Front-load the signal.
Pattern: Summary → Key Findings → Details → Appendix
Surfacing Over Browsing
Bring relevant state to the agent. Don’t require navigation.
Why: Agents lack persistent memory and spatial intuition. Every file read costs context. Browsing that’s cheap for humans is expensive for agents.
Pattern: Commands answer “what’s relevant now?” not “here’s all the data.”
Examples:
- `bd ready` surfaces unblocked work (vs `bd list`)
- SessionStart hook injects context (vs manual file reads)
- SPAWN_CONTEXT.md contains everything needed (vs agent searching)
The test: Does this tool/command require the agent to navigate, or does it surface what matters?
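A minimal sketch of the surfacing pattern (the `Issue` type and `ready` filter are illustrative, not bd's actual implementation): the command hands the agent only what it can act on now, instead of the full list to browse.

```go
package main

import "fmt"

// Issue is a unit of work with explicit dependencies.
type Issue struct {
	ID        string
	Status    string
	BlockedBy []string
}

// ready surfaces only the open issues whose dependencies are done.
func ready(issues []Issue, done map[string]bool) []Issue {
	var out []Issue
	for _, is := range issues {
		if is.Status != "open" {
			continue
		}
		blocked := false
		for _, dep := range is.BlockedBy {
			if !done[dep] {
				blocked = true
				break
			}
		}
		if !blocked {
			out = append(out, is)
		}
	}
	return out
}

func main() {
	issues := []Issue{
		{ID: "A", Status: "open"},
		{ID: "B", Status: "open", BlockedBy: []string{"A"}},
	}
	fmt.Println(ready(issues, map[string]bool{})) // only A is surfaced
}
```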
Evidence Hierarchy
Code is truth. Artifacts are hypotheses.
| Source Type | Examples | Treatment |
|---|---|---|
| Primary | Actual code, test output, observed behavior | This IS the evidence |
| Secondary | Workspaces, investigations, decisions | Claims to verify |
Why: LLMs can hallucinate or trust stale artifacts. Always verify claims against primary sources.
The test: Did the agent grep/search before claiming something exists or doesn’t exist?
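One hedged sketch of treating code as primary and artifacts as hypotheses: before repeating a workspace claim, check the repository directly. The symbol name and search path here are hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
)

// existsInCode checks the primary source (the repository) for a symbol
// before trusting a secondary artifact's claim about it.
func existsInCode(symbol string) bool {
	// grep exits non-zero when there are no matches.
	err := exec.Command("grep", "-r", "-q", symbol, "./internal").Run()
	return err == nil
}

func main() {
	claim := "RetryBudget is not implemented" // from a possibly stale workspace
	if existsInCode("RetryBudget") {
		fmt.Println("artifact claim contradicted by code:", claim)
	} else {
		fmt.Println("no primary evidence found; claim stands for now")
	}
}
```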
Gate Over Remind
Enforce knowledge capture through gates (cannot proceed without), not reminders (easily ignored).
Why: Reminders fail under cognitive load. When deep in a complex problem, “remember to update kn” gets crowded out. Gates make capture unavoidable.
The pattern:
- Reminders: “Please update your investigation file” → ignored when busy
- Gates: Cannot `/exit` without kn check → capture happens
Examples:
- `orch complete` gates on verification checklist (existing gate)
- `kn decide` requires `--reason` flag (CLI gate)
- Beads session close protocol (workflow gate)
The test: Is this a reminder that can be ignored, or a gate that blocks progress?
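A minimal sketch of a gate rather than a reminder (the `checkKnowledgeCaptured` check and the session-close flow are hypothetical): the close step returns an error and refuses to proceed until capture exists, instead of printing advice that can be ignored.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

var errNoCapture = errors.New("no knowledge captured this session")

// checkKnowledgeCaptured stands in for whatever check a real gate would run.
func checkKnowledgeCaptured() error {
	if _, err := os.Stat("SESSION_NOTES.md"); err != nil {
		return errNoCapture
	}
	return nil
}

// closeSession is a gate: it blocks progress until capture has happened.
func closeSession() error {
	if err := checkKnowledgeCaptured(); err != nil {
		return fmt.Errorf("cannot close session: %w", err)
	}
	fmt.Println("session closed")
	return nil
}

func main() {
	if err := closeSession(); err != nil {
		fmt.Println(err) // the session stays open; capture is unavoidable
	}
}
```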
System Design Principles
Architectural choices for this orchestration system. These are design decisions, not universal constraints.
Local-First
Files over databases. Git over external services. Plain text over proprietary formats.
Why: Files are versionable, inspectable, diffable, collaborative. They work offline. They survive service shutdowns. They’re debuggable with Unix tools.
Compose Over Monolith
Small, focused tools that combine. Each command does one thing well.
Why: Composable tools are testable, replaceable, understandable. Monoliths accumulate complexity and become fragile.
Inspiration: Unix philosophy, Git’s porcelain/plumbing split.
Examples:
- `bd` (work tracking) + `kn` (knowledge) + `kb` (artifacts) + `orch` (coordination)
- Each tool focused, combined via workflows
Graceful Degradation
Core functionality works without optional layers.
Why: Workspace state persists even if the tmux window closes. Commands work without optional dependencies. The system shouldn’t be fragile to missing pieces.
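A small sketch of degrading when an optional layer is missing (tmux stands in for any optional dependency; the fallback behavior shown is illustrative):

```go
package main

import (
	"fmt"
	"os/exec"
)

// attach uses tmux when available, but core functionality does not depend
// on it: a missing optional layer degrades behavior, it doesn't break it.
func attach(window string) {
	if _, err := exec.LookPath("tmux"); err != nil {
		fmt.Println("tmux not found; workspace state on disk remains authoritative")
		return
	}
	fmt.Println("attaching to tmux window:", window)
}

func main() {
	attach("orchestrator")
}
```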
Share Patterns Not Tools
When multiple tools need the same capability, share the schema/format, not the implementation.
The test: “Would shelling out to another tool add more complexity than reimplementing?”
What this means:
- Define the contract (file format, schema, protocol) once
- Let each tool implement the logic independently
- Coupling is at the pattern level, not the code level
What this rejects:
- Tool A shelling out to Tool B for simple operations (subprocess overhead, version coordination)
- Shared libraries between tools that don’t share a deployment (dependency hell)
- “Single source of truth” dogma when the truth is a 200-line function
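As a hedged sketch of coupling at the pattern level (the schema fields are illustrative, not the real contract): each tool carries its own small parser for the shared file format rather than importing the other tool or shelling out to it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ArtifactMeta is the shared contract: a file format both tools agree on.
// Each tool keeps its own ~20-line reader instead of sharing the code.
type ArtifactMeta struct {
	Status   string `json:"status"`
	Phase    string `json:"phase"`
	NextStep string `json:"next_step"`
}

func parseMeta(raw []byte) (ArtifactMeta, error) {
	var m ArtifactMeta
	err := json.Unmarshal(raw, &m)
	return m, err
}

func main() {
	raw := []byte(`{"status":"in-progress","phase":"design","next_step":"draft schema"}`)
	m, err := parseMeta(raw)
	fmt.Println(m, err)
}
```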
Meta Principles
How we evolve the system itself.
Evolve by Distinction
Start simple. When problems recur, ask “what are we conflating?” Make the distinction explicit. Name both sides.
Examples of distinctions made:
- Phase vs Status (workflow stage vs ability to continue)
- Tracking vs Knowledge (beads vs investigations)
- Primary vs Secondary evidence (code vs artifacts)
- Source vs Distribution (templates)
- Reminder vs Gate
Why this happens: Reality has more distinctions than our models. We discover distinctions empirically when conflation causes problems.
The pattern: Thesis (simple model) → Antithesis (problem reveals conflation) → Synthesis (refined model with distinction)
Reflection Before Action
When patterns surface, pause before acting. Build the process that surfaces patterns, not the solution to this instance.
The test: “Am I solving this instance, or building the capability to solve all instances?”
What this means:
- Recurring problem → build detection/gate, not one-off fix
- Manual pattern recognition → automate the surfacing
- Cross-project friction → investigate pattern, don’t workaround
What this rejects:
- “I’ll just fix this one quickly” (fixes the symptom, not the system)
- “We can systematize later” (later never comes; system never learns)
- Sprint → accumulate debt → sprint to fix → accumulate more debt
The discipline: Manual work is scaffolding until automated discipline exists. The temptation is the teacher. The pause is the lesson.
Pressure Over Compensation
When the system fails to surface knowledge, don’t compensate by providing it manually. Let the failure create pressure to improve the system.
The test: “Am I helping the system, or preventing it from learning?”
What this means:
- Orchestrator doesn’t know something it should → Don’t paste the answer
- Let the failure surface → That failure is data
- Ask “Why didn’t the system surface this?” → Build the mechanism
What this rejects:
- “Here, let me paste the context you need” (compensates for broken surfacing)
- “I’ll just tell you what you should already know” (human becomes the memory)
- “The system doesn’t know, so I’ll fill in” (prevents system improvement)
The pattern:
- Human compensates for gap → System never learns → Human keeps compensating
- Human lets system fail → Failure surfaces gap → Gap becomes issue → System improves
Why this is foundational: Session Amnesia says agents forget. This principle says don’t be the memory. Your role is to create pressure that forces the system to develop its own memory mechanisms.
Premise Before Solution
“How do we X?” presupposes X is correct. For strategic questions, validate the premise first.
The test: “Am I assuming the direction, or have I tested it?”
The sequence: SHOULD → HOW → EXECUTE
- SHOULD (Premise): Is this direction correct? Is the problem real?
- HOW (Design): Given it’s right, how do we get there?
- EXECUTE (Implementation): Do the work.
What this means:
- “How do we evolve to X?” → First ask “Should we evolve to X?”
- “How do we fix Y?” → First ask “Is Y actually broken?”
- “How do we migrate to Z?” → First ask “Is Z better than what we have?”
Red flag words (signal premise-testing needed):
- “evolve to”, “migrate to”, “transition to” (assumes destination)
- “fix the”, “solve the” (assumes diagnosis)
- “implement the” (assumes solution)
Provenance
These principles emerged from practice, not theory. Each traces to a specific incident.
| Principle | Date | What Broke |
|---|---|---|
| Provenance | Dec 2025 | Recognized that psychosis involved elaborate externalization anchored to closed loops, not reality. The difference wasn’t externalization - it was where the chain terminated. |
| Session Amnesia | Nov 2025 | Investigation on “habit formation” was reframed mid-session: “we’re designing habit formation for agents with amnesia” - revealing the real constraint |
| Evidence Hierarchy | Nov 2025 | Audit agent claimed “feature NOT DONE” by reading stale workspace - feature was actually implemented |
| Gate Over Remind | Dec 2025 | “Why do I always have to remind you to update CLAUDE.md?” - reminders fail under cognitive load |
| Surfacing Over Browsing | Nov 2025 | Beads and orch independently converged on the same pattern (bd ready, orch inbox) - signaling a fundamental design principle |
| Self-Describing Artifacts | Dec 2025 | Agents edited compiled SKILL.md instead of source files, breaking skillc build |
| Evolve by Distinction | Nov 2025 | Phase/Status conflation, Tracking/Knowledge conflation caused recurring confusion |
| Reflection Before Action | Dec 2025 | Urge to manually extract 46 investigation recommendations - recognized kb reflect was the system-level solution |
| Premise Before Solution | Dec 2025 | Epic with 5 children created from “how do we evolve skills” without validating the premise; the architect found the premise was wrong and the epic had to be paused |
| Pressure Over Compensation | Dec 2025 | Was about to paste context the orchestrator should have known. Realized: compensating prevents the system from learning its own memory mechanisms. |
The test for new principles: Can you trace it to a specific failure? If not, it’s not a principle yet.
These principles govern a working system - 37,000+ lines of Go, 100+ investigations, agents spawning agents. They’re not aspirational. They’re load-bearing.