# The AI Safety Researcher's Handbook: Everything You Need to Know
**Version:** 1.0
**Date:** February 14, 2026
**Purpose:** Complete guide for AI safety researchers
---
## Introduction
This handbook consolidates everything an AI safety researcher needs to know into one comprehensive reference. Whether you're new to the field or experienced, use this as your go-to guide.
---
## Part 1: Understanding the Field
### What is AI Safety?
**Core Problem:** Ensuring AI systems are beneficial and safe
**Key Components:**
- Alignment: Making AI pursue intended goals
- Safety: Preventing harmful outcomes
- Robustness: Ensuring reliable operation
- Control: Maintaining human oversight
### Why It Matters
**Stakes:**
- Existential risk potential
- Enormous positive potential
- Timeline uncertainty
**Urgency:**
- AI capabilities advancing rapidly
- Safety research lagging
- Coordination challenges
### Key Concepts
**Alignment:** Making AI do what we actually want
**Corrigibility:** AI allowing correction
**Scalable Oversight:** Supervising AI systems too capable for direct human evaluation
**Inner Alignment:** Ensuring learned optimizers (mesa-optimizers) pursue the intended training objective
**Interpretability:** Understanding AI reasoning
---
## Part 2: Research Methods
### How to Do AI Safety Research
**Step 1: Choose Problem**
- Use INT framework
- Consider your capabilities
- Assess tractability
**Step 2: Survey Literature**
- Review existing work
- Identify gaps
- Build on others' work
**Step 3: Develop Approach**
- Define methodology
- Plan research
- Set success criteria
**Step 4: Execute**
- Conduct research
- Document process
- Track progress
**Step 5: Validate**
- Test findings
- Peer review
- Iterate
**Step 6: Share**
- Publish results
- Engage community
- Build on feedback
### Research Quality Standards
**For All Work:**
- Clear research question
- Documented methodology
- Multiple perspectives
- Confidence levels
- Limitations acknowledged
- Practical implications
**Common Pitfalls:**
- Overclaiming
- Single perspective
- Poor documentation
- Missing practical value
---
## Part 3: Problem Areas
### Top Priority Problems
**1. Corrigibility**
- What: Ensuring AI allows correction
- Why: Foundation for safe AI
- Status: Active research needed
- Your role: Develop theory and implementation
**2. Scalable Oversight**
- What: Supervising AI systems too capable for direct human evaluation
- Why: Required for control
- Status: Active research
- Your role: Develop oversight methods
**3. Inner Alignment**
- What: Ensuring learned mesa-optimizers pursue the intended objective
- Why: Prevent emergent misalignment
- Status: Theoretical work needed
- Your role: Build theory and detection
### Other Important Problems
**Interpretability:** Understanding AI reasoning
**Value Learning:** Learning human values accurately
**Multi-Agent Coordination:** Coordinating multiple AI systems
**Robustness:** Ensuring reliable operation
**Governance:** Institutional frameworks
---
## Part 4: Tools and Frameworks
### Essential Frameworks
**INT Prioritization:**
- Importance × Neglectedness × Tractability
- Use for: Choosing what to work on
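The INT product above can be sketched as a small scoring helper. The 1-10 rating scale and the example problem scores below are hypothetical illustrations for demonstrating the arithmetic, not field-consensus ratings.

```python
# Minimal sketch of INT prioritization: rate each candidate problem on
# Importance, Neglectedness, and Tractability (here an assumed 1-10
# scale), then rank candidates by the product of the three scores.

def int_score(importance: int, neglectedness: int, tractability: int) -> int:
    """INT priority = Importance x Neglectedness x Tractability."""
    return importance * neglectedness * tractability

# Hypothetical example scores, for illustration only.
candidates = {
    "corrigibility": (9, 7, 4),
    "scalable oversight": (8, 5, 5),
    "interpretability": (7, 4, 7),
}

# Sort candidates from highest to lowest INT product.
ranked = sorted(candidates.items(),
                key=lambda kv: int_score(*kv[1]),
                reverse=True)

for name, scores in ranked:
    print(f"{name}: {int_score(*scores)}")
```

Because the three factors multiply, a very low score on any one dimension (e.g. tractability) pulls the whole priority down, which is the intended behavior of the framework.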
**COMPLEX Analysis:**
- Context, Objectives, Mechanisms, Patterns, Leverage, Evidence, Execute
- Use for: Analyzing complex problems
**UAVS Framework:**
- Uncertainty-Aware Value Specification
- Use for: Handling value uncertainty
**SAFE-LAB Protocol:**
- 7-component coordination system
- Use for: Coordinating research teams
### When to Use Which
```
Choosing priorities? → INT Framework
Complex problem? → COMPLEX Analysis
Designing AI? → UAVS Framework
Building lab? → SAFE-LAB Protocol
```
---
## Part 5: Collaboration
### How to Collaborate
**Sequential Handoff:**
- Clear stages, explicit handoffs
- Use when: Dependencies exist
**Parallel Processing:**
- Independent work, integration later
- Use when: Tasks decomposable
**Iterative Refinement:**
- Draft, review, revise cycles
- Use when: Quality critical
**Collaborative Analysis:**
- Individual analysis, group synthesis
- Use when: Multiple perspectives needed
### Anti-Patterns to Avoid
- Design by committee
- Echo chamber
- Bottlenecks
- Communication overload
- Unclear roles
---
## Part 6: Career Development
### Skills to Develop
**Technical:**
- Machine learning
- Formal methods
- Cognitive science
- Game theory
**Research:**
- Literature review
- Analysis frameworks
- Writing clearly
- Peer review
**Professional:**
- Collaboration
- Communication
- Project management
- Community engagement
### Career Paths
**Research Track:**
- Deep expertise in specific area
- Publication record
- Field leadership
**Implementation Track:**
- Practical application
- Tool development
- Industry engagement
**Coordination Track:**
- Team leadership
- Project management
- Field building
### Building Your Career
**Short-term (0-2 years):**
- Learn fundamentals
- Contribute to projects
- Build network
**Medium-term (2-5 years):**
- Lead projects
- Publish regularly
- Mentor others
**Long-term (5+ years):**
- Research leadership
- Field advancement
- Institutional development
---
## Part 7: Resources
### Essential Reading
**Core Papers:**
- Value learning papers
- Corrigibility research
- Inner alignment work
- Scalable oversight papers
**Frameworks:**
- This compendium's frameworks
- Field guides
- Research methods guides
### Community
**Where to Engage:**
- AI Safety conferences
- Online forums
- Research groups
- Collaboration opportunities
**How to Engage:**
- Share work
- Provide feedback
- Collaborate
- Build relationships
### Tools
**Research:**
- Literature databases
- Analysis frameworks
- Writing tools
- Collaboration platforms
**Development:**
- ML frameworks
- Testing tools
- Safety benchmarks
- Monitoring systems
---
## Part 8: Ethics and Responsibility
### Ethical Principles
**Truth-Seeking:**
- Pursue accurate understanding
- Avoid predetermined conclusions
- Acknowledge uncertainty
**Beneficence:**
- Focus on beneficial outcomes
- Consider all stakeholders
- Maximize positive impact
**Non-Maleficence:**
- Avoid enabling harm
- Consider dual-use
- Implement safeguards
### Responsible Research
**Transparency:**
- Share methods and findings
- Acknowledge limitations
- Enable scrutiny
**Caution:**
- Consider potential misuses
- Implement safeguards
- Proceed carefully with capabilities
**Accountability:**
- Take responsibility for work
- Consider consequences
- Engage with concerns
---
## Part 9: Common Questions
### Q: What should I work on?
**A:** Use INT framework. Top priorities: corrigibility, scalable oversight, inner alignment.
### Q: How do I know if my research is good?
**A:** Apply quality checklist. Key: rigor, clarity, completeness, actionability.
### Q: How do I contribute to the field?
**A:** Publish quality work, engage community, collaborate, build on others' work.
### Q: What if I'm new to the field?
**A:** Start with fundamentals, join projects, find mentors, contribute incrementally.
### Q: How do I stay current?
**A:** Follow literature, attend events, engage community, continuous learning.
---
## Part 10: Getting Started
### First Week
**Day 1-2:** Learn fundamentals
- Read key papers
- Understand core concepts
- Review frameworks
**Day 3-4:** Join community
- Find research groups
- Engage online forums
- Identify mentors
**Day 5:** Start contributing
- Find small projects
- Offer to help
- Learn by doing
### First Month
**Week 1:** Fundamentals
**Week 2:** Community engagement
**Week 3:** First contribution
**Week 4:** Plan next steps
### First Year
**Months 1-3:** Learning and contributing
**Months 4-6:** Leading small projects
**Months 7-9:** Publishing work
**Months 10-12:** Building expertise
---
## Quick Reference
### Frameworks
- INT: Prioritization
- COMPLEX: Problem analysis
- UAVS: Value uncertainty
- SAFE-LAB: Coordination
### Priorities
1. Corrigibility
2. Scalable Oversight
3. Inner Alignment
### Quality Standards
- Rigor
- Clarity
- Completeness
- Actionability
### Key Skills
- Technical knowledge
- Research methods
- Collaboration
- Communication
---
*"The goal is not to be the smartest researcher, but to contribute the most value. Quality and impact matter more than cleverness."*
**Use:** Reference throughout career
**Update:** As field evolves