Hi - the author of the post here. I wanted to write up something that was a bit more than "Qwen is the goat" or "Cancelled Claude, run everything local now" or even "The model organised my CD collection, so it's great" - this is real, from-the-trenches stuff. And what the model/harness did achieve may surprise you - along with how it ended up paying for itself.
Interesting, though I take mild semantic issue with this, in a way:
> I said that local models are not the same tool as SOTA. What did I mean by that?
I think you are misusing the important concept of the "state of the art". (It's an industry-wide error.)
You can make a credible case, I think, that open weights models that can be run locally are exactly the "state of the art" in the academic/philosophical sense.
The key point is the meaning of "state of the art". It does not mean "cutting edge". In an important way, the state of the art is behind the cutting edge, because the state of the art is that which is generally available to everyone.
I don't believe, as a matter of ethics and philosophy, that closed models whose dimensions and capabilities are essentially trade secrets, wielded by two companies competing with each other for an IPO, can sensibly be called "state of the art".
I think therefore we should reclaim it: the state of the art can only be embodied in claims that can be verified without constraint, source code that is accessible, and so on; open weights models are closer to that.
I was persuaded of this by a comment here on HN some time ago that made the plausible case that FreeCAD is the "state of the art" in CAD, because it is the most generally available, open source, free-to-experiment-on implementation of core CAD concepts. A brave claim, for sure, but an interesting and profound one.