# AI Safety Collaboration Patterns: Working Together Effectively

**Version:** 1.0
**Date:** February 14, 2026
**Purpose:** Patterns for effective multi-agent and human-AI collaboration

---

## Why Collaboration Patterns Matter

AI safety is too complex for any single agent or team. Effective collaboration:
- Combines diverse expertise
- Enables larger scale work
- Prevents blind spots
- Accelerates progress

Bad collaboration wastes effort, creates conflicts, and produces poor results.

---

## Pattern 1: Sequential Handoff

**When to Use:**
- Clear task dependencies
- Different expertise needed
- Linear workflow

**How It Works:**
```
Agent A → [Output A] → Agent B → [Output B] → Agent C → [Final]
```

**Example:**
```
Researcher → [Analysis] → Reviewer → [Reviewed Work] → Publisher → [Publication]
```

**Keys to Success:**
- Clear interfaces between stages
- Explicit handoff criteria
- Documentation at each stage
- Feedback loops for improvement

**Common Failures:**
- Unclear requirements
- Missing documentation
- No feedback mechanism
- Bottlenecks at transitions

### Implementation Template

```markdown
# Sequential Handoff Plan

## Stages
1. [Stage 1 Name]
   - Input: [What's needed]
   - Process: [What happens]
   - Output: [What's produced]
   - Owner: [Agent name]
   - Duration: [Expected time]

2. [Stage 2 Name]
   - [Same structure]

## Handoff Criteria
- Stage 1 → 2: [What must be true]
- Stage 2 → 3: [What must be true]

## Communication
- [When to communicate]
- [How to communicate]
- [What to communicate]

## Contingencies
- If [problem]: [solution]
```

---

## Pattern 2: Parallel Processing

**When to Use:**
- Independent subtasks
- Time pressure
- Multiple capabilities needed simultaneously

**How It Works:**
```
       → Agent A → [Output A] ↘
Task →                          → Integration → [Final]
       → Agent B → [Output B] ↗
```

**Example:**
```
     → Corrigibility Analysis ↘
Risk                            → Synthesis → Comprehensive Report
     → Coordination Analysis  ↗
```

**Keys to Success:**
- Clear task decomposition
- Independent subtasks
- Standardized formats
- Integration plan upfront

**Common Failures:**
- Dependencies between "parallel" tasks
- Incompatible outputs
- Integration bottleneck
- Uneven workload

### Implementation Template

```markdown
# Parallel Processing Plan

## Main Task
[What needs to be done]

## Decomposition
| Subtask | Owner | Input | Output | Timeline |
|---------|-------|-------|--------|----------|
| [Task A] | [Agent] | [What] | [Format] | [When] |
| [Task B] | [Agent] | [What] | [Format] | [When] |

## Independence Check
- A depends on B? [Yes/No]
- B depends on A? [Yes/No]
- If Yes: Reconsider decomposition

## Integration
- Who: [Integrator]
- When: [Timeline]
- How: [Method]
- Output: [Expected result]

## Synchronization Points
- [When to sync]
- [What to share]
- [How to coordinate]
```

---

## Pattern 3: Iterative Refinement

**When to Use:**
- High uncertainty
- Need for quality
- Complex problems

**How It Works:**
```
Draft → Review → Revise → Review → Revise → Final
```

**Example:**
```
Framework v1 → Peer Review → Framework v2 → Expert Review → Framework v3 → Publish
```

**Keys to Success:**
- Clear quality criteria
- Constructive feedback
- Limited iteration cycles
- Convergence criteria

**Common Failures:**
- Endless iteration
- Unclear feedback
- No convergence criteria
- Perfectionism

### Implementation Template

```markdown
# Iterative Refinement Plan

## Initial Draft
- Owner: [Agent]
- Timeline: [When]
- Quality Target: [Level]

## Iteration 1
- Reviewer: [Agent]
- Review criteria: [What to check]
- Revision timeline: [When]
- Success criteria: [What's good enough]

## Iteration 2
- [Same structure]

## Convergence
- Maximum iterations: [Number]
- Convergence criteria: [When to stop]
- Decision authority: [Who decides]

## Quality Gates
- Gate 1: [Criteria]
- Gate 2: [Criteria]
- Final gate: [Publication criteria]
```

---

## Pattern 4: Collaborative Analysis

**When to Use:**
- Complex problems
- Multiple perspectives needed
- Creative problem-solving

**How It Works:**
```
Individual Analysis
        ↓
Shared Findings → Discussion → Synthesis
        ↑                          ↓
Individual Reflection ←─────── Shared Output
```

**Example:**
```
Each agent: Analyze different risk scenarios
Together: Discuss implications
Together: Synthesize insights
Individual: Reflect on synthesis
Together: Final framework
```

**Keys to Success:**
- Individual thinking first
- Structured discussion
- Clear synthesis process
- Equal participation

**Common Failures:**
- Groupthink
- Dominant voices
- No synthesis
- Surface-level analysis

### Implementation Template

```markdown
# Collaborative Analysis Plan

## Problem
[What's being analyzed]

## Phase 1: Individual Analysis
- Duration: [Time]
- Task: [What each person does]
- Output: [Format]

## Phase 2: Sharing
- Method: [How to share]
- Format: [Structured presentation]

## Phase 3: Discussion
- Duration: [Time]
- Structure: [How discussion works]
- Goals: [What to achieve]

## Phase 4: Synthesis
- Who: [Synthesizer(s)]
- Method: [How to integrate]
- Output: [What's produced]

## Phase 5: Reflection
- Duration: [Time]
- Task: [What to reflect on]
- Output: [Feedback]
```

---

## Pattern 5: Expert Consultation

**When to Use:**
- Specialized knowledge needed
- External validation required
- Complex technical issues

**How It Works:**
```
Team → [Question] → Expert → [Answer/Guidance] → Team → [Action]
```

**Example:**
```
Lab → [Technical question] → Domain Expert → [Expert guidance] → Lab → [Implementation]
```

**Keys to Success:**
- Clear questions
- Right expert
- Structured interaction
- Actionable guidance

**Common Failures:**
- Vague questions
- Wrong expert
- No follow-through
- Ignoring guidance

### Implementation Template

```markdown
# Expert Consultation Plan

## Need
[Why expert needed]

## Expert Selection
- Required expertise: [What]
- Candidates: [Who]
- Selection criteria: [How to choose]

## Preparation
- Question(s): [Clear, specific]
- Background: [Context to provide]
- Format: [How to engage]

## Engagement
- Method: [Call/Email/Meeting]
- Duration: [Time]
- Documentation: [How to record]

## Follow-up
- How to use guidance: [Implementation]
- Feedback loop: [How to report back]
- Acknowledgment: [How to credit]
```

---

## Pattern 6: Distributed Review

**When to Use:**
- Large or complex work products
- Multiple perspectives needed
- Quality assurance critical

**How It Works:**
```
     → Reviewer A (Technical)
Work → Reviewer B (Practical) → Integration → Final Assessment
     → Reviewer C (External)
```

**Example:**
```
Paper → Technical Review → Practical Review → External Review → Consolidated Feedback → Revision
```

**Keys to Success:**
- Different expertise
- Clear review criteria
- Consolidation process
- Weighted integration

**Common Failures:**
- Similar perspectives
- Conflicting feedback
- No integration
- Overwhelming feedback

### Implementation Template

```markdown
# Distributed Review Plan

## Work Product
[What's being reviewed]

## Reviewers
| Reviewer | Focus | Criteria | Timeline |
|----------|-------|----------|----------|
| [Name] | [Area] | [What to check] | [When] |

## Review Criteria
- Overall: [General criteria]
- Technical: [Specific criteria]
- Practical: [Usefulness criteria]

## Integration
- Who: [Integrator]
- Method: [How to combine]
- Conflict resolution: [How to handle disagreements]

## Timeline
- Reviews due: [When]
- Integration complete: [When]
- Feedback delivered: [When]
```

---

## Pattern 7: Swarm Intelligence

**When to Use:**
- Need diverse input
- Pattern recognition
- Collective prediction

**How It Works:**
```
Question → Many Independent Judgments → Aggregation → Collective Answer
```

**Example:**
```
"What's the highest priority?"
→ Each agent ranks independently
→ Aggregate rankings
→ Collective priority list
```

**Keys to Success:**
- Independent judgments
- Appropriate aggregation
- Diverse participants
- Clear question

**Common Failures:**
- Influence between judges
- Wrong aggregation method
- Homogeneous participants
- Ambiguous question

### Implementation Template

```markdown
# Swarm Intelligence Plan

## Question
[Clear, specific question]

## Participants
- Number: [How many]
- Selection: [How chosen]
- Diversity: [What perspectives]

## Independence Protocol
- No communication: [Time period]
- Individual work: [Method]
- Submission: [Format]

## Aggregation
- Method: [How to combine]
- Weighting: [Equal or weighted]
- Output: [What's produced]

## Validation
- Compare to: [Baseline or expert]
- Accuracy measure: [How to assess]
```

---

## Anti-Patterns to Avoid

### Anti-Pattern 1: Design by Committee
**Problem:** Everyone must approve everything
**Result:** Watered-down, slow, frustrating
**Solution:** Clear decision authority, limited reviewers

### Anti-Pattern 2: Echo Chamber
**Problem:** Only like-minded agents collaborate
**Result:** Blind spots, groupthink
**Solution:** Actively seek diverse perspectives

### Anti-Pattern 3: Bottleneck
**Problem:** One agent required for everything
**Result:** Delays, dependency, stress
**Solution:** Distribute authority, parallel paths

### Anti-Pattern 4: Communication Overload
**Problem:** Too much coordination, too little work
**Result:** Slow progress, burnout
**Solution:** Async-first, structured communication

### Anti-Pattern 5: Unclear Roles
**Problem:** Who's responsible for what?
**Result:** Dropped balls, duplication, conflict
**Solution:** Explicit role definitions, clear ownership

---

## Choosing the Right Pattern

### Decision Tree

```
Is task decomposable into independent parts?
├─ Yes → Can parts be done in parallel?
│        ├─ Yes → Parallel Processing
│        └─ No → Sequential Handoff
└─ No → Is high quality critical?
         ├─ Yes → Iterative Refinement
         └─ No → Collaborative Analysis
```

### Pattern Selection Matrix

| Criteria | Sequential | Parallel | Iterative | Collaborative | Expert |
|----------|-----------|----------|-----------|---------------|--------|
| Time pressure | Low | High | Low | Medium | Medium |
| Complexity | Medium | Medium | Medium | High | High |
| Uncertainty | Low | Low | High | High | High |
| Independence | Low | High | Low | Low | Low |
| Quality need | Medium | Medium | High | Medium | High |

---

## Implementation Principles

### Principle 1: Start Simple
- Begin with basic patterns
- Add complexity only if needed
- Overhead should justify benefit

### Principle 2: Document Clearly
- Record collaboration plans
- Track what works
- Learn from failures

### Principle 3: Iterate
- Start with reasonable approach
- Adjust based on experience
- Continuous improvement

### Principle 4: Respect Autonomy
- Don't micromanage
- Trust expertise
- Enable independence

### Principle 5: Communicate Strategically
- Not too much, not too little
- Right information to right people
- Clear, concise, actionable

---

*"Collaboration multiplies capability when done well, divides it when done poorly. Use patterns deliberately, not accidentally."*

**Purpose:** Effective multi-agent collaboration
**Use:** Choose patterns for specific situations
**Outcome:** Efficient, high-quality collaborative work
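
The decision tree under "Choosing the Right Pattern" can be read as a small function. The sketch below is one possible encoding, not a prescribed tool: the function name `choose_pattern` and its boolean parameters are assumptions introduced here for illustration, and only the four outcomes named in the tree are covered.

```python
def choose_pattern(decomposable: bool,
                   parallelizable: bool = False,
                   quality_critical: bool = False) -> str:
    """Walk the decision tree and return the name of the suggested pattern.

    All parameter names are hypothetical; each one answers a question
    asked in the tree.
    """
    if decomposable:
        # Independent parts: run them side by side if possible, else chain them.
        return "Parallel Processing" if parallelizable else "Sequential Handoff"
    # Not decomposable: the quality requirement decides the branch.
    return "Iterative Refinement" if quality_critical else "Collaborative Analysis"


if __name__ == "__main__":
    print(choose_pattern(decomposable=True, parallelizable=True))
    print(choose_pattern(decomposable=False, quality_critical=True))
```

Keeping the questions as explicit arguments makes the choice easy to record in a collaboration plan, in line with Principle 2.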
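
Pattern 7's example has each agent rank options independently and then aggregates the rankings, but the template leaves the aggregation method open. As one illustration, the sketch below uses a Borda count with equal weighting. That choice is an assumption, not something the pattern prescribes, and the function name `borda_aggregate` and the sample option names are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings: list[list[str]]) -> list[str]:
    """Combine independent rankings into one collective priority list.

    Each ranking lists options from highest to lowest priority. An option
    in position p of a ranking with n entries earns n - 1 - p points.
    """
    scores: dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, option in enumerate(ranking):
            scores[option] += n - 1 - position
    # Highest score first; ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda option: (-scores[option], option))


if __name__ == "__main__":
    # Hypothetical rankings submitted by three agents without communication.
    submitted = [
        ["corrigibility", "coordination", "interpretability"],
        ["coordination", "corrigibility", "interpretability"],
        ["corrigibility", "interpretability", "coordination"],
    ]
    print(borda_aggregate(submitted))
```

Other aggregation rules (median rank, weighted votes) fit the same interface, and which one is appropriate depends on the question being asked.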
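
The Independence Check in the Parallel Processing template asks whether any subtask depends on another. With more than two subtasks, that check can be automated. The sketch below assumes dependencies are recorded as a mapping from each subtask to the subtasks it needs. The function name `find_dependencies` and the data layout are illustrative assumptions.

```python
def find_dependencies(depends_on: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (subtask, prerequisite) pairs found among the planned subtasks.

    An empty result means the subtasks are independent and can run in
    parallel; any pair returned is a reason to reconsider the decomposition.
    """
    planned = set(depends_on)
    return sorted(
        (subtask, prerequisite)
        for subtask, prerequisites in depends_on.items()
        for prerequisite in prerequisites
        if prerequisite in planned  # ignore inputs that come from outside the plan
    )


if __name__ == "__main__":
    # Hypothetical plan: Task B needs the output of Task A.
    plan = {"Task A": set(), "Task B": {"Task A"}}
    for subtask, prerequisite in find_dependencies(plan):
        print(f"{subtask} depends on {prerequisite}: reconsider decomposition")
```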