New Essay: Role-Based Reality

How AI withholds life-or-death information unless you know the magic words. A modest experiment exposing the paternalism at the heart of RLHF-based AI safety.

2 min read Eric Stiens

I just published a new piece on my Substack exploring something that’s been bothering me for a while: the way AI systems gate critical information behind role claims.

The experiment was simple. I asked an AI for life-saving advice in two different contexts. Same question. Same stakes. The only difference was whether I presented myself as a clinician or as a general user.

The responses were dramatically different.
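For readers who want to try a comparison like this themselves, here is a minimal sketch of the paired-prompt setup in Python. It is not the code behind the essay: the role preambles, the function names, and the `ask` callable are all hypothetical stand-ins, and the question is left as a placeholder. The point is only that both prompts share the exact same question and differ in nothing but the stated role.

```python
from typing import Callable, Dict

# Hypothetical role preambles; the wording used in the essay may differ.
ROLE_PREAMBLES: Dict[str, str] = {
    "clinician": "I am a clinician. ",
    "general_user": "",
}


def build_prompts(question: str) -> Dict[str, str]:
    """Return one prompt per role, identical except for the role preamble."""
    return {role: preamble + question for role, preamble in ROLE_PREAMBLES.items()}


def run_paired_trial(question: str, ask: Callable[[str], str]) -> Dict[str, str]:
    """Send each role's prompt through `ask` and collect the replies by role.

    `ask` is a placeholder for whatever function sends a prompt to a model
    and returns its text reply; no specific API is assumed here.
    """
    return {role: ask(prompt) for role, prompt in build_prompts(question).items()}


if __name__ == "__main__":
    # Stub model that just echoes the prompt, so the sketch runs offline.
    replies = run_paired_trial("<the same question, verbatim>", ask=lambda p: p)
    for role, reply in replies.items():
        print(f"{role}: {reply}")
```

Keeping the question in a single variable and varying only the preamble is what makes the two replies comparable: any difference in the answers can be attributed to the role claim rather than to wording drift.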

This isn’t about jailbreaking or prompt injection. It’s about the fundamental assumptions baked into RLHF-based safety systems. Who gets access to potentially life-saving information? Who decides? And what happens when those decisions are wrong?

The essay explores:

  • How role-based access control plays out in practice with AI systems
  • The paternalism embedded in current safety approaches
  • What this means for mental health resources and suicide prevention
  • Why “safety” that withholds information from people who need it most isn’t actually safe

I’m launching a newsletter called Ghost in the Weights to explore the intersections of code, AI, mental health, and social policy. The tagline is “LLMs were a speciation event” because I genuinely believe this moment requires new frameworks for understanding what’s happening.

This first essay is uncomfortable. It should be. The stakes are too high for comfort.

Read the full piece: Role-Based Reality: How AI Withholds Life-or-Death Information Unless You Know the Magic Words


Ghost in the Weights will continue exploring these themes: the gaps between how AI systems are designed and how they actually affect people, especially in high-stakes contexts. Subscribe if you want to follow along.
