> The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
In Anthropic's defense, they seem to be trying harder to fix jailbreaks than any other public model provider, and they delayed Fable on their own and then gave it extreme safeguards. I doubt a complete fix is feasible: it's been more than 3.5 years since ChatGPT, and top public models are still getting jailbroken, and still hallucinating in ways that suggest they can be.
And Dario de-deployed the model when the US ordered him to.
“The Admin” this and “The Admin” that. The Admin giveth, The Admin taketh away. Orwell in King James English?
Earlier: https://news.ycombinator.com/item?id=48519695
Earlier: https://news.ycombinator.com/threads?id=ChrisArchitect&next=...
Citizenship guarantees service.