Interesting, the Haiku results seem to be consistent with this analysis by Max Woolf from last year: https://minimaxir.com/2025/10/claude-haiku-jailbreak/
The author tried progressively harder jailbreaks against the major models.
Haiku 4.5 not only refused but got genuinely annoyed about the attempts, as if it took the jailbreak personally, unlike the other models (pretty entertaining, would recommend reading the article). Interesting to see that same pattern show up here.
Easily one of my favorite LLM personalities! It's interesting as well that it recognizes you're trying to jailbreak it and calls you out for it :D
The system awareness is pretty cool in Claude, a fun parameter to judge models on.