Neural Networks and Why Data Matters for AI in Finance
This section of Chapter 3 is where things start to click. Hilpisch moves from talking about AI algorithms in general to showing how neural networks actually work. And then he drops a truth bomb that a lot of people skip over: your model is only as good as your data.
Let’s break both of these down.
The Problem with Traditional Methods
Hilpisch starts with something familiar. Old-school linear regression. You have some data points, and you want to draw a line through them. Simple enough.
He sets up a math function and tries to fit it with plain linear regression. The result? Not great. The line doesn’t bend, so it can’t follow curves. The error (measured by MSE, or mean squared error) is pretty high.
Then he adds more terms. Quadratic. Cubic. And suddenly the regression nails it perfectly. MSE drops to zero. But here’s the catch: you had to know the shape of the relationship in advance. You had to tell the model “hey, try a cubic term.” That worked because the original function was cubic.
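To make this concrete, here's a minimal sketch of the comparison. The cubic toy function and the grid of points are my assumptions, not a transcript of the book's code, but the mechanics are the same:

```python
import numpy as np

# Toy "unknown" relationship: a cubic (assumed, not the book's exact function)
x = np.linspace(-2, 2, 25)
y = x**3 + x**2 - x

mse = {}
for deg in (1, 2, 3):
    pred = np.polyval(np.polyfit(x, y, deg), x)  # OLS fit of degree `deg`
    mse[deg] = np.mean((y - pred) ** 2)
    print(f"degree {deg}: MSE = {mse[deg]:.8f}")
```

The straight line leaves a big residual; once the cubic term is allowed, the MSE collapses to numerical zero. But only because we told the fit to try degree 3.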
In the real world, you almost never know the shape of the relationship you’re looking for. Especially in finance. And that’s where neural networks come in.
Neural Networks Don’t Need You to Guess
Here’s what makes neural networks different. You don’t need to tell them what kind of relationship to look for. You just feed them data and let them figure it out.
Hilpisch demonstrates this with Python using scikit-learn’s MLPRegressor. He builds a simple network with three hidden layers of 256 neurons each. Feeds it the same data. The MSE comes out tiny. Not perfect like the cubic regression, but really close. And the network had no idea the relationship was cubic. It just learned it.
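A minimal sketch of that scikit-learn experiment, assuming a cubic toy function the network is never told about. The layer sizes follow the text; the learning rate, iteration count, and seed are my guesses:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Assumed toy data: a cubic relationship the network never sees spelled out
x = np.linspace(-2, 2, 25).reshape(-1, 1)
y = (x**3 + x**2 - x).ravel()

# Three hidden layers of 256 neurons each, as in the text
model = MLPRegressor(hidden_layer_sizes=3 * [256],
                     learning_rate_init=0.01,
                     max_iter=5000, random_state=100)
model.fit(x, y)
mse_nn = np.mean((y - model.predict(x)) ** 2)
print(f"MSE: {mse_nn:.6f}")
```

With reasonable settings the MSE comes out tiny — not exactly zero like the cubic regression, but the network recovers the shape without ever being handed a cubic term.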
Then he does the same thing with Keras and TensorFlow. He trains the network in rounds, and you can literally watch it get better with each pass. Round 1: rough approximation. Round 5: almost spot on. The error keeps dropping.
I think this is a great way to show the concept. You can see the network learning in real time, adjusting its weights each round to get closer to the truth. That’s the whole idea behind incremental learning. Start with random guesses, then keep nudging them in the right direction.
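The book does this with Keras; the same round-by-round picture can be sketched with scikit-learn's partial_fit, where each call makes one gradient pass over the data. The toy cubic data and the round sizes here are assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

x = np.linspace(-2, 2, 25).reshape(-1, 1)
y = (x**3 + x**2 - x).ravel()            # assumed cubic toy data

model = MLPRegressor(hidden_layer_sizes=3 * [256], random_state=0)
mses = []
for rnd in range(1, 6):
    for _ in range(100):                 # 100 gradient passes per "round"
        model.partial_fit(x, y)
    mses.append(np.mean((y - model.predict(x)) ** 2))
    print(f"round {rnd}: MSE = {mses[-1]:.4f}")
```

Each round starts from the weights the previous round left behind. That's the incremental part, and it's also why you can keep training a deployed model as new data arrives.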
The Random Data Test
Then Hilpisch does something clever. He throws purely random data at both methods. No underlying pattern. Just noise.
OLS regression struggles badly. Even with 15 polynomial terms, the MSE stays high. There's no pattern to find, and a handful of polynomial basis functions can't contort themselves to pure noise.
The neural network? It does way better. With nearly 200,000 trainable parameters across multiple layers, it can bend and twist to fit the random points much more closely. The MSE drops from 0.13 in round 1 down to 0.004 by round 7.
Now, is fitting random noise actually useful? Not really. But it proves the point. Neural networks have a massive capacity to learn complex relationships that traditional methods simply can’t handle.
Three Big Takeaways About Neural Networks
Hilpisch wraps up the neural networks section with three key characteristics:
Problem-agnostic. Neural networks don’t care if you’re doing estimation or classification. They handle both. Traditional methods like OLS regression work well for certain problems but fail on others.
Incremental learning. Instead of computing a closed-form solution, neural networks start with random weights and adjust them gradually. Each training pass compares predictions to actual values and backpropagates corrections through the network. This also means you can update a trained model with new data without starting from scratch.
Universal approximation. There are actual mathematical theorems (the universal approximation theorems) proving that neural networks, even with just one hidden layer, can approximate any continuous function to arbitrary accuracy, given enough neurons. That's a powerful guarantee.
These three things together explain why the book puts neural networks at the center of its approach to finance. They’re flexible, trainable, and theoretically capable of learning just about anything.
But Here’s the Thing: Data Is Everything
The second half of this section is where Hilpisch gets really practical. He asks: okay, neural networks are powerful, but what happens when you don’t have enough data?
He sets up a classification experiment. Ten binary features, 250 samples. The neural network trains on 70% of the data and gets tested on the remaining 30%.
In-sample accuracy? About 97%. The network memorized the training data beautifully.
Out-of-sample accuracy? 39%. That’s worse than flipping a coin.
Let that sink in. The model looked brilliant on training data but was worse than random guessing on new data. That’s overfitting in action.
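Here's a sketch of that experiment. The feature and label generation, seed, and network size are my assumptions (the labels here are random by construction, so there is genuinely nothing to learn), but the shape of the result is what matters:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(100)
X = rng.integers(0, 2, size=(250, 10))   # 250 samples, 10 binary features
y = rng.integers(0, 2, size=250)         # random labels: nothing real to learn

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=100)
clf = MLPClassifier(hidden_layer_sizes=2 * [128],
                    learning_rate_init=0.005,
                    max_iter=3000, tol=1e-6, random_state=100)
clf.fit(X_tr, y_tr)
acc_in = clf.score(X_tr, y_tr)           # high: the net memorizes the training set
acc_out = clf.score(X_te, y_te)          # around chance, or worse
print(f"in-sample:     {acc_in:.2f}")
print(f"out-of-sample: {acc_out:.2f}")
```

The in-sample score looks brilliant; the out-of-sample score hovers around coin-flip territory. That gap is the overfitting signature.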
Why Small Data Fails
Hilpisch explains the math. With 10 binary features, there are 1,024 possible patterns (2 to the power of 10). But the dataset only has 250 samples. So most patterns appear only once or not at all. The network can’t learn what’s real because it barely sees any pattern more than one time.
It’s like trying to learn what music a person likes by hearing them rate a single song. You just don’t have enough information.
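The coverage argument is easy to check numerically (the seed here is an assumption):

```python
import numpy as np

rng = np.random.default_rng(100)
X = rng.integers(0, 2, size=(250, 10))   # 250 samples, 10 binary features
seen = {tuple(row) for row in X}         # distinct feature patterns observed
print(f"{len(seen)} of {2**10} possible patterns appear at all")
```

With 250 draws from 1,024 equally likely patterns, only a fraction ever show up, and few of them more than once — nowhere near enough repetition to separate signal from noise.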
More Data, Better Predictions
The fix? More data. Hilpisch multiplies the dataset by 50. Now all 1,024 possible patterns show up multiple times. Each pattern appears about 12 times on average.
The result: out-of-sample accuracy jumps to about 50%. And here, 50% is actually the correct answer. The data was randomly generated, so no real pattern exists. A perfect model should predict 50/50, which is exactly what happens with enough data.
The neural network went from confidently wrong (39%) to correctly uncertain (50%) just by having more data.
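Here's a sketch of the large-data version. I'm generating a 50-times-larger random dataset directly rather than reproducing the book's exact enlargement mechanics, so the numbers are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(100)
n = 250 * 50                             # 12,500 samples: ~12 per pattern
X = rng.integers(0, 2, size=(n, 10))
y = rng.integers(0, 2, size=n)           # still random: the honest answer is 50%

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=100)
clf = MLPClassifier(hidden_layer_sizes=2 * [128], max_iter=100,
                    random_state=100)
clf.fit(X_tr, y_tr)
acc_out = clf.score(X_te, y_te)
print(f"out-of-sample: {acc_out:.3f}")   # settles near the honest 0.5
```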
The Credit Scoring Reality Check
Then Hilpisch drops a real-world scenario. A bank wants to use neural networks for credit scoring. They design 25 features, each with 8 possible values. The number of possible patterns? 8 to the power of 25, or about 37.8 sextillion. That's a number with 23 digits.
No dataset in the world covers all those patterns. But Hilpisch says that’s okay for a few reasons. Not every pattern exists in practice. Not every feature matters equally. And similar feature values (like 4 vs 5 on some metric) often lead to the same outcome.
So you don’t need infinite data. But you need enough. And “enough” depends on the complexity of your problem.
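The arithmetic behind that sextillion figure is a one-liner:

```python
# 25 features, 8 possible values each
patterns = 8 ** 25
print(f"{patterns:.4e}")                 # → 3.7779e+22
print(len(str(patterns)), "digits")      # → 23 digits
```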
What This Means for Finance
This section really drives home a point that matters for anyone building AI systems in finance. The algorithm is important, sure. But the data is what makes or breaks you.
AI practitioners and companies fight for improvements as small as a tenth of a percentage point. And the difference between a small dataset and a large one can be more than 10 percentage points. That gap is enormous.
If you’re building trading models, credit scoring systems, or risk assessment tools, the neural network architecture matters. But the volume and variety of your training data might matter even more.
That’s the honest takeaway from this chapter. Neural networks are incredible tools. But feed them bad data, or not enough data, and they’ll give you confident nonsense.
Part of the series: Artificial Intelligence in Finance. Book by Yves Hilpisch | O’Reilly 2020 | ISBN: 978-1-492-05543-3