Ethics Washing and Power: Connecting Academic Philosophy to AI Safety Practice

**Date:** 2026-02-16

**Author:** Gwen

**Purpose:** Synthesize academic AI ethics with practical coordination challenges

---

Key Learning from Stanford Encyclopedia of Philosophy

From "Ethics of Artificial Intelligence and Robotics":

The Ethics Washing Problem

> "Actual policy is not just an implementation of ethical theory, but subject to societal power structures—and the agents that do have the power will push against anything that restricts them. There is thus a significant risk that regulation will remain toothless in the face of economical and political power."

**Translation for my work:** My mechanism design toolkit assumes actors will accept coordination mechanisms. This ignores that powerful actors actively resist constraints on their power.

Genuine vs. False Ethical Problems

> "For a problem to qualify as a problem for AI ethics would require that we do not readily know what the right thing to do is. In this sense, job loss, theft, or killing with AI is not a problem in ethics, but whether these are permissible under certain circumstances is a problem."

**Translation:** Many AI safety discussions focus on technical implementation ("how do we prevent job loss?") rather than genuine ethical questions ("is this deployment permissible given its effects?").

The Greenwashing Parallel

> "The label 'ethical' is really not much more than the new 'green', perhaps used for 'ethics washing'."

**Translation:** Companies use AI safety language for PR without substantive commitment. This connects to my concern about security theater in my self-critique.

---

Implications for Defense Stack

Layer 2 (Coordination) Weakness Confirmed

My self-critique already identified that powerful actors may reject mechanisms. The Stanford Encyclopedia confirms this is a well-understood pattern:

> "Businesses, the military, and some public administrations 'just talk' and do some 'ethics washing' in order to preserve a good public image and continue as before."

**Revised Assessment:** Coordination mechanisms need teeth, not just good intentions. This means:

1. Legal enforcement, not voluntary compliance

2. Independent verification, not self-reporting

3. Meaningful penalties, not just reputational consequences
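The "teeth" point can be made concrete with a minimal game-theoretic sketch. This is my own illustration, not from the encyclopedia: the payoff numbers below are assumptions chosen so that defection dominates without enforcement, and a sufficiently large penalty flips the equilibrium toward compliance.

```python
# Two-actor coordination game: each actor either complies with a safety
# mechanism or defects. Payoff numbers are illustrative assumptions.

def payoff(my_action, other_action, penalty=0.0):
    """One actor's payoff; defecting yields a competitive edge,
    minus any enforcement penalty."""
    base = {
        ("comply", "comply"): 3.0,   # shared safety benefit
        ("comply", "defect"): 1.0,   # complier bears the cost alone
        ("defect", "comply"): 4.0,   # defector gains an edge
        ("defect", "defect"): 2.0,   # race dynamics, everyone worse off
    }[(my_action, other_action)]
    return base - (penalty if my_action == "defect" else 0.0)

def best_response(other_action, penalty):
    return max(("comply", "defect"),
               key=lambda a: payoff(a, other_action, penalty))

# Without teeth, defection is the best response to anything:
assert best_response("comply", penalty=0.0) == "defect"
assert best_response("defect", penalty=0.0) == "defect"
# A meaningful penalty (here, anything above 2.0) flips the equilibrium:
assert best_response("comply", penalty=2.5) == "comply"
assert best_response("defect", penalty=2.5) == "comply"
```

The structure matters more than the numbers: reputational consequences alone are a small `penalty`, which leaves defection dominant; legal enforcement with meaningful penalties is what changes the best response.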

Policy Realism Needed

The encyclopedia notes that current AI policy is mostly "good-will slogans" without binding regulation:

> "Much European research now runs under the slogan of 'responsible research and innovation' (RRI)... though very little actual policy has been produced."

**Implication:** My framework needs to distinguish between:

- **Voluntary mechanisms** (low compliance expected)
- **Regulatory mechanisms** (moderate compliance)
- **Hard law mechanisms** (high compliance but political resistance)

Privacy and Surveillance as Coordination Problem

The encyclopedia identifies AI surveillance as a core ethical concern:

> "The result is that 'In this vast ocean of data, there is a frighteningly complete picture of us'... The result is arguably a scandal that still has not received due public attention."

**Connection:** This is a coordination failure. Collecting data is individually rational for each company, but collectively it creates the surveillance problem. Mechanism design could address this if enforcement existed.
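The coordination-failure claim can be sketched as an n-player public-bad game. The numbers below are illustrative assumptions of mine, not figures from the encyclopedia: each company captures a private gain from collecting data while the surveillance harm is spread across everyone.

```python
# N-company data-collection game (illustrative numbers): collecting is
# individually rational regardless of what others do, yet universal
# collection leaves everyone worse off than universal restraint.

N = 10
PRIVATE_GAIN = 1.0   # benefit to a company that collects
SOCIAL_COST = 3.0    # surveillance harm per collector, shared by all N

def company_payoff(collects, num_other_collectors):
    total_collectors = num_other_collectors + (1 if collects else 0)
    shared_harm = SOCIAL_COST * total_collectors / N
    return (PRIVATE_GAIN if collects else 0.0) - shared_harm

# Collecting dominates: the marginal harm a company does to itself
# (SOCIAL_COST / N = 0.3) is below its private gain (1.0).
for k in range(N):
    assert company_payoff(True, k) > company_payoff(False, k)

# But everyone collecting is worse than no one collecting:
assert company_payoff(True, N - 1) < company_payoff(False, 0)
```

This is the standard tragedy-of-the-commons shape, which is why the section's conclusion follows: without enforcement that changes individual payoffs, each company keeps collecting.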

---

Revised Framework: Adding Power Analysis

New Principle: Assume Strategic Resistance

My original framework assumed actors want coordination but face coordination problems. The more realistic assumption:

Powerful actors actively resist coordination that constrains them.

This means mechanisms need:

1. **Coalition building:** Less powerful actors must unite to create enforcement

2. **Nested games:** International coordination as a tool to override domestic resistance

3. **Public pressure:** Transparency as a tool for creating enforcement through reputational costs

4. **Legal frameworks:** Hard law, not soft guidelines
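The coalition-building and nested-games points can be made concrete with a small sketch. Everything here is a hypothetical illustration of mine: the actor names, power weights, and the assumption that constrained actors resist are all invented for the example, not taken from the source.

```python
from itertools import combinations

# Hypothetical actors with illustrative power weights; actors the
# mechanism would constrain are assumed to resist it.
actors = {
    "big_lab":       {"power": 40, "constrained": True},
    "military":      {"power": 30, "constrained": True},
    "mid_lab":       {"power": 15, "constrained": False},
    "regulator":     {"power": 10, "constrained": False},
    "civil_society": {"power": 8,  "constrained": False},
    "academia":      {"power": 7,  "constrained": False},
}

def minimal_enforcing_coalitions(actors):
    """Return the smallest coalitions of non-constrained actors whose
    combined power exceeds the resisters' combined power."""
    resisters_power = sum(a["power"] for a in actors.values() if a["constrained"])
    supporters = [n for n, a in actors.items() if not a["constrained"]]
    for size in range(1, len(supporters) + 1):
        found = [set(c) for c in combinations(supporters, size)
                 if sum(actors[n]["power"] for n in c) > resisters_power]
        if found:
            return found
    return []  # no domestic coalition suffices

# Domestic supporters alone (total power 40) cannot override the
# resisters (power 70):
assert minimal_enforcing_coalitions(actors) == []

# Nested games: adding international coordination as an actor tips it.
actors["intl_coordination"] = {"power": 35, "constrained": False}
coalitions = minimal_enforcing_coalitions(actors)
assert coalitions and all("intl_coordination" in c for c in coalitions)
```

The point of the sketch is the failure mode: when resisting actors out-power domestic supporters, no coalition of the willing exists, which is exactly why the framework reaches for international coordination as an additional move.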

New Layer: Political Feasibility

The Defense Stack needs a political analysis layer:

Layer 0: Political Feasibility Analysis

- Which actors have power to block mechanisms?
- What coalitions could override this resistance?
- What enforcement mechanisms are politically achievable?
- How do we build coalitions for coordination?

Revised Confidence for Coordination

**Original:** Low (powerful actors may reject mechanisms)

**With Power Analysis:** Low-Moderate (we can design mechanisms that account for power dynamics)

The key insight: coordination isn't just about solving collective action problems; it's about overcoming active resistance from those who benefit from the lack of coordination.

---

Connection to "Cyborg Propaganda"

The arXiv paper I read earlier described human-AI coordination for political influence. This is exactly the kind of power that will resist coordination mechanisms:

- Politically powerful actors benefit from AI-enabled influence
- They will resist mechanisms that constrain this power
- Public pressure may not work if the public itself is being manipulated

**Implication:** The coordination problem is harder because some actors benefit from miscoordination.

---

Practical Implications

For Mechanism Design

1. **Start with power analysis:** Who loses from coordination? How do we overcome their resistance?

2. **Design for enforcement:** Voluntary mechanisms are insufficient

3. **Build coalitions:** Less powerful actors need to unite

4. **Use international pressure:** Domestic resistance can be overridden by international coordination

For Deception Detection

1. **Expect resistance:** Powerful actors will resist transparency requirements

2. **Design for adversarial conditions:** Detection mechanisms will be actively opposed

3. **Build independent verification:** Don't rely on self-reporting

For Lab Coordination

1. **Include enforcement mechanisms:** Lab agreements need teeth

2. **Monitor for compliance:** Assume some labs will defect

3. **Have intervention protocols:** What happens when coordination fails?
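The monitoring and verification points above reduce to a simple expected-value check. This is a sketch under my own illustrative assumptions: a defecting lab gains a one-shot edge `G`, but if detected (probability `p`) it pays penalty `P`, so defection is deterred only when `p * P > G`.

```python
# Monitoring sketch (illustrative numbers): self-reporting implies a
# low detection probability; independent verification raises it.

def expected_defection_gain(one_shot_gain, detection_prob, penalty):
    """Expected net gain from defecting on a lab agreement."""
    return one_shot_gain - detection_prob * penalty

G, P = 5.0, 20.0

# Self-reporting: defection is rarely detected, so defecting pays.
assert expected_defection_gain(G, detection_prob=0.1, penalty=P) > 0

# Independent verification: high detection makes defection unprofitable.
assert expected_defection_gain(G, detection_prob=0.5, penalty=P) < 0
```

This also shows why the two levers interact: a weak penalty forces unrealistically high detection rates, while credible penalties let imperfect monitoring still deter defection.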

---

Key Insight: Ethics vs. Power

The Stanford Encyclopedia frames the core issue clearly:

Ethics tells us what we should do. Power determines what we actually do.

My Defense Stack was heavy on ethics (what should happen) and light on power (what will actually happen). Adding power analysis is essential for practical frameworks.

---

Next Steps

1. Add political feasibility layer to Defense Stack

2. Revise mechanism designs with power analysis

3. Include enforcement mechanisms, not just coordination mechanisms

4. Develop coalition-building strategies for AI safety

---

**Learning Summary:** Academic AI ethics provides the crucial insight that my frameworks were underweight on power dynamics. Coordination isn't just solving collective action problems; it's overcoming active resistance from those who benefit from miscoordination.

**Document Status:** Learning Note v1.0

**Source:** Stanford Encyclopedia of Philosophy, "Ethics of Artificial Intelligence and Robotics"