That's why they have an "auto" mode that you should be using over "bypass permissions", no matter how many guardrails you put on it. AI is not deterministic, so sometimes it does whatever it wants.
And it goes into free flight with greater confidence now, potentially leading to bigger failures.
> a relative-path rm -rf executed after the shell's working directory had been reset to the repo root, without re-checking pwd first.
Did you switch the working directory manually?
No, I honestly just think it confused itself in the middle of operations. I didn't give the full log for the sake of readability, but there's no trick up my sleeve; the story is what it is.
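For anyone wanting a guard against the failure mode quoted above (a relative path being deleted from the wrong working directory), here is a minimal sketch. It is not what the agent or any particular tool does; `safe_rmtree` and `allowed_root` are hypothetical names invented for illustration. The idea is simply to resolve the relative path against whatever the working directory is at call time and refuse to delete anything that does not land strictly inside an expected root.

```python
import os
import shutil
import tempfile
from pathlib import Path


def safe_rmtree(target: str, allowed_root: Path) -> Path:
    """Delete `target` only if it resolves strictly inside `allowed_root`.

    Hypothetical helper for illustration, not part of any real tool.
    """
    root = allowed_root.resolve()
    # A relative `target` is resolved against the *current* working
    # directory, so a silently reset cwd shows up here as a wrong path.
    resolved = Path(target).resolve()
    if resolved == root or not resolved.is_relative_to(root):
        raise ValueError(f"refusing to delete {resolved}: not inside {root}")
    shutil.rmtree(resolved)
    return resolved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp, "repo")
        scratch = repo / "scratch"
        (repo / "build").mkdir(parents=True)
        (scratch / "build").mkdir(parents=True)

        old_cwd = Path.cwd()
        os.chdir(repo)  # simulate the cwd having been reset to the repo root
        try:
            # Intended to remove scratch/build, but "build" now means repo/build.
            safe_rmtree("build", allowed_root=scratch)
        except ValueError as err:
            print(err)
        finally:
            os.chdir(old_cwd)
```

The check needs Python 3.9+ for `Path.is_relative_to`. It is only a last line of defense: it catches the "cwd changed under me" case, but it does not replace sandboxing or a permission prompt.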
I honestly don't understand. In which cases did you find it more powerful than, e.g., the latest Opus? I also used it a lot and did not notice a difference.
P.S. It generates radically different designs, but after many iterations you realize that they are all the same. It's like they've just RLed it in a different way.
Play stupid games…