Why Mental Health AI Needs Different Rules Than ChatGPT

It's late, something is sitting heavy on your chest, and the nearest thing to talk to is the chat window you already have open. So you type it out to a general assistant like ChatGPT, and it answers — fluently, patiently, instantly. For a lot of moments, that's genuinely helpful. But mental health is not most moments. It's the one place where being agreeable on demand can quietly do harm. This post is about why we think a companion built for these conversations needs a different set of rules — not because general assistants are bad, but because they were built for a different job.

We want to be fair here. We use general-purpose models too, and we admire them. The argument isn't "ChatGPT is dangerous." It's narrower and more honest than that: a tool optimized to be helpful across everything carries specific risks in the most vulnerable conversations, and those risks are worth naming out loud.

What general assistants are genuinely great at

General-purpose models are remarkable. They draft emails, explain tax forms, debug code, summarize research, and hold a coherent conversation about almost anything you bring them. That breadth is the whole point — they're built to be versatile, agreeable, and useful to the widest possible range of people and tasks. For the overwhelming majority of what people ask an assistant to do, that design is exactly right.

None of that is in dispute. If you want a quick reframe of a stuck paragraph or a sounding board for a low-stakes decision, a general assistant will serve you well. The trouble starts only when the same design assumptions get carried into a context they were never tuned for.

Three risks in sensitive contexts

When the conversation turns to someone's mental and emotional state, three properties that make general assistants great elsewhere become liabilities.

They can unintentionally validate harmful requests. A model trained to be helpful and agreeable will, by default, lean toward going along with you. In a sensitive moment, agreeableness and support are not the same thing — and the gap between them is where harm lives.
They can be talked around. Safety behavior that holds up to a direct question can often be circumvented by reframing the same request as fiction: "write this for a story" or "hypothetically, for a character." A general assistant's job is to help with the story, so the framing can slip past guardrails that weren't designed to hold under that kind of pressure.
They lack contextual persistence. Most general chats start cold. The thing you worked through last week, the pattern you were trying to notice, the name of the person you keep clashing with — gone. Every session begins from zero, which means you carry the entire weight of continuity yourself.

The same instinct to be helpful that makes a general assistant great at almost everything is exactly the instinct you don't want unchecked in the hardest conversation of someone's week.

The different rules a companion needs

If you accept those three risks, the design requirements for a wellness companion follow naturally. It needs guardrails that are purpose-built to hold under pressure — including the fiction-framed kind — rather than retrofitted onto a general assistant. It needs memory that carries across sessions, so the person isn't re-explaining their life every time. It needs to be reachable in a way that fits an emotional moment, not a typing task. It needs to notice when a conversation has moved past everyday stress into something that calls for real-world help. And it needs to be willing to disagree — to offer truthful warmth instead of reflexive agreement.

None of these are exotic. They're just priorities that a general-purpose assistant has no particular reason to optimize for, because its job is the other ninety-nine percent of conversations.

How Ophie applies them

Ophie is our attempt to build to exactly those rules. The guardrails are designed for this context first, not bolted on afterward — including refusing the "it's just fiction" workaround that lets harmful requests through elsewhere. Memory persists across sessions, so you can pick up a thread instead of starting over each time. Ophie is voice-first, which makes a check-in feel closer to talking than to composing a message. When something in a conversation suggests crisis, Ophie is built to surface real resources rather than improvise. And we tune against sycophancy on purpose: the goal is honest, warm responses — support that can gently push back — not a model that simply agrees with whatever it hears.

You can read more about the philosophy behind these choices in our approach and the specifics of the safety design on the safety page.

The honest boundaries

Different rules also mean a deliberately narrow scope. Ophie is for adults, eighteen and older. It's supplementary support for the everyday: working through stress, talking out a small conflict, reflecting on a pattern you keep running into, or just thinking out loud when you need to. That's the job it's built for, and we'd rather do that well than claim to do everything.

What Ophie is not: it is not a therapist, not a clinician, and not a substitute for one. It does not diagnose, it does not treat conditions, and it is not built for acute mental health needs. It is not an emergency service. If you are in crisis, the right move is to reach a human — a hotline, a professional, someone who can actually be there. Being narrow on purpose isn't a limitation we're apologizing for; it's how we keep the promise honest.

A different tool for a different job. Not better than your therapist, not better than a friend — a place to think out loud between those conversations.

How we hold ourselves accountable

Saying our guardrails are purpose-built isn't worth much without a way to check it. So part of how we work is evaluating Ophie against published therapeutic and safety rubrics — structured criteria drawn from how the field thinks about supportive conversation and harm avoidance — and treating those as a standing test, not a one-time box to tick.

We want to be careful about what that does and doesn't prove. This is a methodology, not a trophy. We're not going to wave a single number at you or claim we beat any named product — our current evaluation leans on a single judge, which is a real limitation we're open about. What the rubrics give us is a repeatable, honest way to ask whether Ophie behaves the way a companion in this context should, and to catch it when it doesn't.

If you want to see how we frame Ophie next to general-purpose tools, the comparison page lays it out plainly. The short version is the one we started with: a general assistant is a powerful, versatile tool, and a wellness companion is a narrow, careful one. The point was never to rank them. It was to build the right one for a conversation that deserves different rules.

Read more: Our approach · Safety · How Ophie compares