Scale outweighs built-in knowledge
Researchers in AI often aspire to build knowledge into their agents or algorithms, believing this to be the correct approach. If we can make the machine see and understand the world the way we do, isn't that the best we can do? Yet time and time again we've seen that building in knowledge leads to specialized algorithms that excel in the short term but get overrun by growing compute in the long term.
Chess in the 90s, Go in the 2010s, and now more than ever with models like GPT-3: specialized, refined approaches have been outperformed by more general methods trained at much larger scale. AlphaZero beats any human player or hand-crafted engine in both Chess and Go, and GPT-3 produces some of the best text completion results to date while also performing well on a wide variety of tasks.
The Bitter Lesson here is that general-purpose methods that scale with increasing compute are incredibly powerful, and appear to outweigh specialized algorithms in the long term. This goes hand in hand with the idea that Minds are not simple: trying to build in our own understanding can restrict the arbitrary complexity necessary for learning.