W28 - "The Bitter Lesson"

I only recently grasped the significance of the short essay “The Bitter Lesson.” Published on Richard Sutton’s personal blog in 2019, it distills the biggest lesson of 70 years of AI research: the true drivers of AI progress are general methods that fully leverage compute, not domain-specific tricks that depend on human expert knowledge. Sutton cites examples from speech recognition, natural language processing, and computer vision: early work relied largely on expert-designed features and rules, while later statistical methods and the rise of deep learning far outperformed those approaches.

The evolution of complex systems and intelligence follows a similar pattern: progress comes not from human design but from mechanisms that can adapt to environments and discover and scale solutions autonomously.

I also came across a 2019 book — the Chinese edition of Terrence Sejnowski’s The Deep Learning Revolution, published under the title Deep Learning: The Core Driving Force of the Age of Intelligence. Sejnowski is one of the early advocates of neural networks and, together with Hinton, co-invented the Boltzmann machine, so he writes with real authority. The book was popular at the time; AlphaGo and AlphaZero’s breakthroughs had already drawn broad market and policy attention to AI. BERT and GPT, by contrast, were still mostly technical terms discussed within the field, and the book doesn’t even mention them.

I bought the hardcover to ride the hype. Although it is written for a general audience, without prior exposure to the field the material felt distant and hard to engage with. Revisiting it now feels completely different. What seemed like hype back then now looks more like concentrated attention within a community than a mainstream breakthrough. That shift makes reading it today quite interesting: having witnessed the development of AI over these years, many names and terms from back then suddenly feel familiar, and many earlier episodes now echo later developments.

It’s a curious experience: things that once seemed unfamiliar and abstract have become subjects I can understand and discuss.
