Big Data, Machine Learning, and the Future of Emerging Markets

The investment industry loves a good buzzword. And for the last several years, “big data” and “machine learning” have been the ones getting all the attention. Fund managers talk about them in the same breath, like they’re the same thing. They’re not. And the authors make that distinction very clear in this final chapter.

Big data is almost certainly real, but it’s not going to save anyone. Machine learning might actually be revolutionary, but the ways it can blow up in your face are legendary. This chapter covers both, and then asks a question that most EM investors never think about: what happens if emerging markets just… stop existing?

The Big Data Arms Race

Here’s the authors’ take on big data, and it’s refreshingly honest. Big data is basically an arms race. If you’re first, you make money for a while. Then everyone catches up, your edge disappears, and all you’re left with is a bigger data bill. The real winners might be the companies selling the data infrastructure, not the funds using it. Think selling shovels during a gold rush.

Fixed income investors have been somewhat lucky here. Big data hit equity markets first and harder. It’s way easier to scrape the web to predict a company’s earnings than to predict a payroll number. Companies leave a huge digital footprint online. Government economic statistics? Not so much. They get adjusted, smoothed, and run through models that web scrapers can’t easily replicate.

The US payroll number is a perfect example. It goes through seasonal adjustments, birth-death model adjustments, and other tweaks. Even if you built a better survey to measure employment, it wouldn’t matter. The market trades the official nonfarm payrolls (NFP) number, not some arguably better private estimate. ADP already publishes an alternative private payrolls estimate, and the market still focuses on the government release.

But here’s the thing. Big data is coming for fixed income too. And EM fixed income specifically might actually see it faster than developed market fixed income in some cases.

Where Should EM Investors Focus First?

The authors have a clear answer: inflation data.

Why? Because local rates are the EM asset class most driven by local factors. FX and credit are way more globally driven. So if you’re going to spend money getting better local data, focus it on the thing that moves local rates the most. And most EM central banks target headline inflation. So that’s where you point the big data cannons.

The good news is that inflation forecasting actually works pretty well with web scraping. As more retail moves online, you can track prices in real time. The Citi team had some success improving CPI forecasts for Mexico using web-scraped data. The trick is not just measuring “true” inflation but predicting the specific number the government will publish. That requires knowing how each country’s statistical agency processes its data.
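To make the “predict the published number, not the truth” idea concrete, here is a minimal sketch. It assumes you already have a monthly inflation series built from scraped prices and fits a simple line that maps it onto the official prints. The numbers are made up for illustration, the function names are hypothetical, and a single-regressor fit is only a stand-in; the book does not spell out the model the Citi team used.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical month-on-month inflation rates in percent (not real data).
scraped_mom = [0.40, 0.10, 0.55, 0.30, 0.20, 0.45]   # built from scraped prices
official_mom = [0.35, 0.15, 0.45, 0.30, 0.22, 0.40]  # published CPI prints

intercept, slope = fit_line(scraped_mom, official_mom)


def predict_official(scraped_value):
    """Translate a scraped reading into a forecast of the official print."""
    return intercept + slope * scraped_value


forecast = predict_official(0.50)
print(f"slope {slope:.2f}, forecast of official CPI m/m: {forecast:.2f}%")
```

A slope below one in a fit like this would suggest the official series moves less than the scraped one, which is the kind of agency-specific processing the forecast has to absorb.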

Beyond inflation, the other big target is Chinese economic activity. China’s growth numbers affect everything in EM. And everyone knows China smooths its GDP data. So for China, it might be better to try to figure out “the truth” rather than predict the official number. Satellite data tracking cargo ships is one promising approach. If you can see how many ships are loading up at Chinese ports, you have a real-time read on trade flows.

What Do Investors Actually Want?

The authors share data from Eagle Alpha on what types of alternative data investors are most interested in. The top of the list is consumer transactions (12%) and geolocation (12%), followed by business insights (11%) and pricing (11%). Down at the bottom you find satellite and weather data (1%), trade data (1%), and B2B datasets (1%).

This tells you something interesting. Most of the demand is coming from equity investors, not fixed income. The top categories are much more useful for stock picking than for bond trading. But pricing data ranks high, and that’s directly relevant for EM rates investors trying to forecast inflation. Social media and sentiment? Only 3% interest. Surprising, but maybe investors learned that Twitter sentiment isn’t actually a great predictor of macro.

The authors think satellite data is underappreciated, especially for EM fixed income. Getting China’s trade flows right has massive implications for the whole EM complex. Processing satellite data is hard, but the potential payoff is there.

The AI Horse Race

Now for the fun part. The authors actually ran a competition between different machine learning algorithms to see which ones work best for EM trading.

But first, the caveats. And there are a lot of them.

As the authors frame it, machine learning is really just fancy regression analysis. You’re fitting a lot more regressions, including ones with quadratic and higher-order terms. That flexibility makes overfitting a massive risk: your model looks incredible in sample and then falls apart the moment you try to use it live.
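A toy illustration of that failure mode, under loud assumptions: the data are made up (a true relation of y = x observed with alternating noise), and the “flexible” model is a degree-7 polynomial forced through every training point, which is an extreme stand-in for piling on higher-order terms rather than anything the authors ran.

```python
def fit_linear(xs, ys):
    """Least-squares straight line; returns a prediction function."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_interpolating_polynomial(xs, ys):
    """Degree n-1 polynomial through all n points (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a prediction function on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data: the true relation is y = x, observed with alternating noise.
train_x = [float(i) for i in range(8)]
train_y = [x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(train_x)]
# Out-of-sample points sit between the training points, on the true line.
test_x = [x + 0.5 for x in train_x[:-1]]
test_y = list(test_x)

linear = fit_linear(train_x, train_y)
flexible = fit_interpolating_polynomial(train_x, train_y)

for name, model in [("linear", linear), ("degree-7 polynomial", flexible)]:
    print(f"{name}: in-sample MSE {mse(model, train_x, train_y):.3f}, "
          f"out-of-sample MSE {mse(model, test_x, test_y):.3f}")
```

The polynomial fits the training points perfectly because it has memorized the noise, and that is exactly why it does worse than the plain line on points it has not seen.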

The best use cases for ML need truly big datasets. In financial markets, that’s very rare unless you’re looking at intraday data. Daily data for sovereign fixed income? You’ve got maybe a few dozen tradable countries. That’s tiny. Compare that to Netflix, which has billions of data points for its recommendation engine. Financial markets just don’t have that luxury.

Time series data also isn’t naturally suited for ML because of autocorrelation. ML has had its biggest wins in cross-sectional data. And in sovereign fixed income, there just aren’t that many things to compare cross-sectionally.

Having said all that, the authors ran the horse race anyway. They tried to predict weekly returns for the Brazilian real (BRL) using three economic surprise indices and three technical indicators. Here’s what they found:

Random forest delivered an information ratio (IR) of 1.24. Solid.

Gradient boosting came in at 1.20. Also good.

A simple equal-weighted average of the trading signals? IR of 1.04. Perfectly fine.

SVM classifier hit 0.91. Decent.

K-nearest neighbors matched the simple average at 1.04.

Logistic regression was the weakest at 0.54.

The winner? A voting classifier that combined all the ML models together scored 1.26. Slightly better than random forest alone.

The takeaway isn’t that ML is magic. It’s that combining ML classifiers with fundamental signals might beat both pure quant and pure fundamental approaches. The authors suggest that for a fundamental EM manager, using ML to combine their views with technical signals could work well.
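For readers who want to see the mechanics, here is a rough sketch of the two pieces behind those numbers: a majority vote across model calls and an information ratio on the resulting weekly returns. Everything in it is an assumption for illustration. The model calls and BRL returns are invented, the vote is a simple hard vote with ties mapped to no position, and the IR is taken as annualized mean return over annualized volatility. The summary above does not state which voting scheme or IR convention the authors used.

```python
from math import sqrt
from statistics import mean, stdev

WEEKS_PER_YEAR = 52


def information_ratio(weekly_returns):
    """Annualized mean weekly return divided by annualized volatility."""
    return mean(weekly_returns) / stdev(weekly_returns) * sqrt(WEEKS_PER_YEAR)


def majority_vote(calls):
    """Combine +1 (long) and -1 (short) calls; a tie means no position (0)."""
    total = sum(calls)
    return (total > 0) - (total < 0)


def strategy_returns(positions, asset_returns):
    """Weekly P&L from holding each week's position through that week."""
    return [p * r for p, r in zip(positions, asset_returns)]


# Hypothetical weekly long/short calls from three classifiers (not real output).
model_calls = {
    "random_forest": [1, 1, -1, 1, -1, 1],
    "gradient_boosting": [1, -1, -1, 1, -1, 1],
    "svm": [-1, 1, -1, 1, 1, 1],
}
# Hypothetical weekly BRL returns for the same six weeks.
brl_weekly_returns = [0.012, 0.004, -0.009, 0.007, -0.003, 0.010]

voted = [majority_vote(week) for week in zip(*model_calls.values())]
voted_ir = information_ratio(strategy_returns(voted, brl_weekly_returns))

print("voted positions:", voted)
for name, calls in model_calls.items():
    ir = information_ratio(strategy_returns(calls, brl_weekly_returns))
    print(f"{name}: IR {ir:.2f}")
print(f"voting ensemble: IR {voted_ir:.2f}")
```

Six weeks of invented data says nothing about real performance, so the IR values printed here are meaningless as evidence. The point is only the structure: each model votes, the ensemble takes the majority, and the same yardstick scores every variant.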

One interesting philosophical point they make: a human with a chess computer beats a chess computer alone. Human supervision is necessary for supervised learning. If the ML model spits out features that make no fundamental sense, you should probably ignore them. Eventually, you might be able to code that fundamental knowledge into the machine. But we’re not there yet.

1998 vs. 2008: A Big Data Story Before Big Data Existed

The authors tell a really compelling story about how EM matured between the two big crises.

During the 1997-1998 Asian financial crisis, not a single EM country could cut interest rates before its currency stabilized. Currencies were crashing, inflation was spiking because of the devaluations, and central banks were stuck. Some even had to hike rates into a recession. Brutal.

Fast forward to 2008. Most EM central banks actually cut rates before the worst of the FX depreciation was over. This was a huge deal. It meant EM had “grown up” enough to use developed-market-style counter-cyclical monetary policy. Investors who bet on lower EM rates after the 2008 crash made a ton of money.

But the authors add a really important caveat that a lot of people miss. Commodity prices fell way more in 2008 than in 1998. Look at the numbers for Asian currencies. Energy prices in local currency terms fell between 64% and 73% in 2008 for most Asian currencies. In 1998? The changes ranged from a 20% decline to a 193% increase (that’s Indonesia, where the currency collapsed so badly that energy prices actually skyrocketed in rupiah terms).

Food prices tell a similar story. In 2008, food prices in local currency dropped 24% to 43% across Asian currencies. In 1998, they mostly went up.

Why does this matter? Because the commodity price crash in 2008 was so severe that it offset the inflationary impact of weaker currencies. So it might not have been that EM economies had fundamentally lower FX pass-through. It might have been that commodities were doing the heavy lifting in keeping inflation contained. You couldn’t tell the difference because commodity prices and FX were so highly correlated.
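The arithmetic behind “commodity prices in local currency terms” is worth spelling out, because the two effects multiply rather than add. The sketch below uses invented scenarios, not the book’s data, and the function name is hypothetical.

```python
def local_currency_change(usd_price_change, usd_vs_local_change):
    """Change in a commodity's price measured in local currency.

    Inputs and output are fractional changes (0.10 means +10%).
    usd_vs_local_change is the change in the local-currency price of one
    US dollar, so a positive value means the local currency weakened.
    """
    return (1.0 + usd_price_change) * (1.0 + usd_vs_local_change) - 1.0


# Hypothetical scenario A: commodity falls 60% in USD, dollar gains 25%.
scenario_a = local_currency_change(-0.60, 0.25)

# Hypothetical scenario B: commodity falls 30% in USD, dollar gains 300%.
scenario_b = local_currency_change(-0.30, 3.00)

print(f"scenario A: {scenario_a:+.0%} in local currency")
print(f"scenario B: {scenario_b:+.0%} in local currency")
```

In scenario A the commodity crash swamps the weaker currency and local prices still fall by half. In scenario B the currency collapse dominates and local prices nearly triple even though the commodity got cheaper in dollars, which is the same mechanism as the Indonesian case above.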

This is important for anyone who assumed the 2008 experience meant EM had permanently graduated to a new level. Maybe they had. But maybe the test wasn’t as hard as it looked.

Will We Run Out of Emerging Markets?

Here’s the big existential question. If emerging markets keep developing, will there eventually be no emerging markets left? The authors’ answer: probably not.

In theory, every EM should be able to “graduate” to developed market status. The US was a developing country once. There’s a development path that goes something like this: as countries get richer, their service sectors grow, their economies formalize, FX pass-through to inflation declines, and they build bigger local financial systems. Eventually, their bonds should trade more like developed market bonds, less correlated with risk appetite and currency moves.

Some of this has already happened, as the 1998 vs. 2008 comparison shows. But the road to developed market (DM) status isn’t a one-way street.

Greece is the poster child for this. Greek bonds traded like near-risk-free G3 assets until 2008. Then everything fell apart, and the correlation between Greek bond yields and the VIX went strongly positive. Greece basically went from acting like a developed market to acting like an emerging market. Chile is another example. After the protests that started in late 2019, its reputation as the Switzerland of Latin America took a serious hit.

The authors use a great metaphor here. It’s like the Greek myth of Sisyphus, the guy condemned to push a boulder up a hill only to watch it roll back down every time he gets near the top. Global recessions do the same thing to emerging markets. They push countries back to the bottom of the mountain.

Unlike Sisyphus, though, some countries do make it. But most of the ones that succeeded had strong institutional support from the developed world. EU membership, for example, can keep the boulder at the top. Without that kind of anchor, the climb is much harder and the risks of backsliding are real.

The bottom line: new countries will keep moving from frontier to EM status, elections will throw some countries off their trajectory, and global recessions will knock others back. The supply of emerging markets isn’t going to dry up anytime soon.

The Punchline

The authors wrap up the book on a practical note. Big data is coming to EM fixed income. Get on it early or get left behind. Focus on CPI forecasting and Chinese activity data. When it comes to ML, random forest and gradient boosting show promise for EM FX trading, but overfitting is a constant threat. And don’t worry about EM disappearing. Between countries graduating and others falling back, plus new frontier markets coming into the benchmarks, there will be plenty of EM to trade for decades.

The last line of the chapter is perfect: “We look forward to many more decades of happy EM trading, AI enhanced or with a mushy brain at the controls.”

That pretty much captures the whole book. These are practitioners who genuinely love what they do. They’ve given us a rigorous, data-driven framework for thinking about EM markets. And they’re honest about both what works and what doesn’t.

Whether you’re running ML models or still relying on your own mushy brain, the principles in this book hold up. Understand the global macro drivers. Know your local factors. Be disciplined about carry, momentum, and value. Build portfolios that can survive the bad times. And keep learning, because the tools and data are only getting better.


Book Details:

  • Title: Trading Fixed Income and FX in Emerging Markets
  • Authors: Dirk Willer, Ram Bala Chandran, Kenneth Lam
  • Publisher: Wiley
  • Year: 2020
  • ISBN: 978-1-119-59905-0
