AI Success Stories: From Atari to AlphaGo and the Hardware Behind It

Chapter 4 of Artificial Intelligence in Finance opens with something fun: stories about AI beating humans at games. And honestly, these stories are some of the most fascinating parts of AI history. Games sound trivial, but they’re actually perfect testing grounds for intelligence. If a machine can figure out a game on its own, what else can it figure out?

DeepMind and Atari: Learning Like a Kid

Here’s the setup. The Atari 2600 came out in 1977. Games like Space Invaders, Asteroids, Breakout. Total classics. Simple by today’s standards, but they still require real decision-making from the player.

In 2013, a company called DeepMind published a paper that turned heads. Their team built an AI agent that could play seven Atari 2600 games using reinforcement learning and a neural network. The agent beat human expert scores on three of them.

But here’s what made it special. The AI didn’t get any instructions. No tips, no strategy guides. It just looked at the raw pixels on screen and figured out what to do through trial and error. Same way you’d learn Breakout as a kid: try stuff, see what works, do more of that.

And they used the same network architecture and learning algorithm for all seven games. Not seven hand-crafted programs, one design that could pick up each game from scratch. That’s like hiring one person who can pick up any new job just by watching.

Hilpisch walks through a code example using OpenAI Gym’s CartPole environment to show this in action. An AI agent learns to balance a pole on a cart through trial and error. After training on data from its own good attempts, it scores a perfect 200 out of 200 on every single run. A hundred games, all perfect.

That’s reinforcement learning in its purest form. No human telling the machine what “good” looks like. It just figures out what works.
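The book’s example uses OpenAI Gym and a neural network; as a dependency-free illustration of the same trial-and-error idea, here is a minimal sketch that hand-codes the classic cart-pole physics and “learns” by random policy search, sampling random linear policies and keeping whichever balances longest. The physics constants and termination thresholds follow the standard cart-pole formulation; everything else is an assumption for illustration, not the book’s code.

```python
import math
import random

def cartpole_step(state, action):
    """One 0.02 s physics step of the classic cart-pole system."""
    g, m_cart, m_pole, length, force_mag, dt = 9.8, 1.0, 0.1, 0.5, 10.0, 0.02
    x, x_dot, theta, theta_dot = state
    force = force_mag if action == 1 else -force_mag
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    total_mass = m_cart + m_pole
    temp = (force + m_pole * length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (g * sin_t - cos_t * temp) / (
        length * (4.0 / 3.0 - m_pole * cos_t ** 2 / total_mass))
    x_acc = temp - m_pole * length * theta_acc * cos_t / total_mass
    return (x + dt * x_dot, x_dot + dt * x_acc,
            theta + dt * theta_dot, theta_dot + dt * theta_acc)

def run_episode(weights, max_steps=200):
    """Play one episode with a linear policy; return steps survived (max 200)."""
    state = tuple(random.uniform(-0.05, 0.05) for _ in range(4))
    for t in range(max_steps):
        # Push right (1) or left (0) based on a weighted sum of the state.
        action = 1 if sum(w * s for w, s in zip(weights, state)) > 0 else 0
        state = cartpole_step(state, action)
        if abs(state[0]) > 2.4 or abs(state[2]) > 0.21:  # cart off track / pole fell
            return t + 1
    return max_steps

# Trial and error: sample random policies, keep whichever survives longest.
random.seed(0)
best_weights, best_score = None, 0
for _ in range(500):
    weights = [random.uniform(-1.0, 1.0) for _ in range(4)]
    score = min(run_episode(weights) for _ in range(5))  # worst of 5 episodes
    if score > best_score:
        best_weights, best_score = weights, score
```

Even this crude search finds policies that balance the pole for the full 200 steps: nobody tells the agent what “balancing” means, only how long each attempt lasted.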

AlphaGo: The Game Nobody Expected AI to Win

Go is over 2,000 years old. It’s played on a 19x19 board, and the number of possible board positions is larger than the number of atoms in the universe. That’s not an exaggeration. It’s just math.

In 2014, the Oxford philosopher Nick Bostrom, one of the most-cited thinkers on the future of AI, predicted it would take about a decade for AI to beat the best human Go players. He was off by about nine years.

DeepMind’s AlphaGo beat the European Go champion Fan Hui 5-0 in 2015. Some people shrugged. Fan Hui was strong, but he wasn’t the best in the world.

So DeepMind went bigger. In March 2016, AlphaGo played Lee Sedol, the 18-time world champion who held a 9th dan ranking. Over 200 million people watched. AlphaGo won 4-1. It even earned a 9th dan ranking itself, the first time a computer ever received that honor.

That was already incredible. But then they built AlphaGo Zero. This version didn’t study any human games at all. Zero human data. It learned entirely through self-play, competing against different versions of itself. The result? AlphaGo Zero beat the original AlphaGo 100-0. Not 100-1. Not 99-1. A hundred to zero.

Centuries of collected human Go wisdom? Turns out you don’t need it. That’s both humbling and a little unsettling if you think about it too long.

Chess: From Brute Force to Real Intelligence

Chess has a different AI story. Computer chess programs go way back. Hilpisch mentions ZX Chess from 1983, a chess engine that ran in just 672 bytes on the Sinclair ZX81. It couldn’t even do castling, but squeezing any chess engine into that space was impressive. That record for smallest chess program stood for 33 years.

The big moment came in 1997. IBM’s Deep Blue beat Garry Kasparov, the world chess champion. But here’s the thing about Deep Blue: it wasn’t really “intelligent.” It was a $10 million supercomputer with 30 nodes and 480 special chess chips that could analyze 200 million positions per second. Pure brute force. Check every possible move, pick the best one. No learning, no intuition.
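That “check every possible move” approach is just exhaustive game-tree search. Here’s a toy sketch of the idea on a game small enough to search completely — Nim with 12 stones, take 1 to 3 per turn, last stone wins. This is only an illustration of brute-force search; Deep Blue’s real engine added hand-tuned evaluation functions and custom hardware on top of it.

```python
def best_move(stones):
    """Search every line of play; return (move, True if it forces a win)."""
    for take in (1, 2, 3):
        if take > stones:
            continue
        if take == stones:              # taking the last stone wins outright
            return take, True
        # Recurse: if the opponent has no winning reply from the
        # resulting position, this move forces a win for us.
        _, opponent_wins = best_move(stones - take)
        if not opponent_wins:
            return take, True
    return 1, False                     # every move loses: position is lost
```

No intuition anywhere: the program simply examines every continuation to the end of the game and picks a move that cannot lose, which is exactly why this style of play scales with hardware speed rather than with insight.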

Kasparov had beaten computers 32-0 in a 1985 simultaneous exhibition. Twelve years later, the machines caught up, mostly by throwing hardware at the problem.

Fast forward to 2017. Kasparov points out that any free chess app on your phone can now rival a human Grandmaster. The cost of beating a Grandmaster dropped from $10 million to about $100. A factor of 100,000.

But traditional chess engines like Stockfish still rely on thousands of human-designed rules and heuristics. They’re powerful, but they’re essentially playing with a really good cheat sheet built by decades of human chess knowledge.

Then AlphaZero showed up. Same approach as AlphaGo Zero: start from scratch, learn the rules, play yourself millions of times. After just 9 hours of training, AlphaZero was beating Stockfish. In a 1,000-game test, AlphaZero won 155 games, lost only 6, and drew the rest.

And here’s the wildest stat. Stockfish analyzes about 60 million positions per second. AlphaZero? About 60,000. A thousand times fewer. But it still wins. It’s not looking at more positions. It’s looking at better positions. That feels a lot closer to how humans actually think about chess: pattern recognition, intuition, strategic foresight. Just faster.

Why Hardware Is the Unsung Hero

All these success stories have something in common: none of them would have happened without massive improvements in hardware.

Hilpisch includes a table tracking AlphaGo’s hardware needs across versions, and the trend is striking:

  • AlphaGo Fan (2015): 176 GPUs, over 40,000 watts of power
  • AlphaGo Lee (2016): 48 TPUs, around 10,000 watts
  • AlphaGo Master (2016): 4 TPUs, under 2,000 watts
  • AlphaGo Zero (2017): 4 TPUs, under 2,000 watts

The system got smarter and more efficient at the same time. Better algorithms meant less hardware needed. Better hardware meant more capability per chip.

GPUs were originally built for video games. They need to do tons of parallel math to render graphics. Turns out, that same parallel math is exactly what neural networks need. A top consumer GPU in 2019, the Nvidia RTX 2080 Ti with 4,352 cores, could hit about 15 TFLOPS. That’s roughly 15 times faster than the best consumer CPU. And the price? About $1,400. Compare that to what similar compute power cost a decade earlier. Suddenly AI research isn’t just for billion-dollar companies.
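A quick sanity check on that TFLOPS figure: peak throughput is roughly cores × clock × operations per cycle. The clock speed and per-core throughput below are assumptions for illustration (a boost clock around 1.7 GHz and one fused multiply-add, i.e. 2 floating-point operations, per core per cycle), not published specs from the book.

```python
# Back-of-the-envelope peak throughput for a GPU like the RTX 2080 Ti.
cores = 4352              # CUDA cores, as quoted in the text
clock_hz = 1.7e9          # assumed boost clock (~1.7 GHz)
flops_per_cycle = 2       # assumed: one fused multiply-add = 2 FLOPs
peak_tflops = cores * clock_hz * flops_per_cycle / 1e12
print(round(peak_tflops, 1))  # ~14.8, in the same ballpark as the ~15 TFLOPS above
```

Thousands of simple cores, each doing one multiply-add per tick, add up fast — and that structure happens to match the matrix math at the heart of neural networks.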

Then there’s the cloud. Rent GPUs and TPUs by the hour. Google even built TPUs specifically for AI workloads. You don’t need to own the hardware. You just need a credit card and a good idea.

Hilpisch boils it down to three points: performance keeps going up, costs keep coming down, power consumption keeps shrinking.

Why This Matters for Finance

Hilpisch makes a subtle point here. These game-playing achievements might seem disconnected from finance. But the same techniques that learned Breakout from raw pixels and mastered Go through self-play are being applied to financial markets. Reinforcement learning, neural networks, pattern recognition without human-designed rules.

Financial markets are complex. Possibly more complex than Go. But if an AI can find patterns in a game with more positions than atoms in the universe, what could it find in market data?

That’s the promise. And looking at the hardware trajectory, the tools to chase that promise are getting cheaper and more powerful every year.


This post is part of a series on “Artificial Intelligence in Finance” by Yves Hilpisch (O’Reilly, 2020, ISBN 978-1-492-05543-3).

Previous: Neural Networks and Why Data Matters

Next: Superintelligence: Forms, Paths, and the Control Problem
