Occam’s razor is a cornerstone of the social sciences, and for financial economists it is almost an article of faith. The principle is named after William of Ockham, a 14th-century friar. It holds that the simplest explanation for any phenomenon is the best. Financial analysts today live in fear of “overfitting”: producing a model that, by dint of its complexity, maps onto existing data well, while predicting the future poorly. Now, though, Ockham is on trial. New research suggests that, when it comes to big machine-learning models, parsimony is overrated and complexity might be king. If that is true, the methods of modern investing will be upended.

The debate began in 2021, when Bryan Kelly and Kangying Zhou of Yale University, and Semyon Malamud of the Swiss Federal Institute of Technology in Lausanne, published “The Virtue of Complexity in Return Prediction”. In one exercise, Mr Kelly and his co-authors analysed just 12 months of data using a model with 12,000 separate “parameters”, or settings. Using so many of them—the opposite of what Occam’s razor prescribes—would traditionally have been thought to raise the risk of overfitting. Yet the complexity in fact seemed to help the model forecast the future. Might Occam’s razor, the paper’s authors asked, be Occam’s blunder?
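To make the shape of such an exercise concrete, here is a minimal sketch of a model with far more parameters than observations. It is not the authors' code: the use of random sine and cosine features, the ridge penalty, the 15 simulated predictors and the placeholder returns are all assumptions chosen for illustration. Only the 12 months and 12,000 parameters come from the text.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma=1.0, seed=0):
    """Map T x K predictors to T x n_features sine/cosine features.

    Assumes n_features is even. The fixed seed means the same random
    weights are reused on every call.
    """
    rng = np.random.default_rng(seed)
    half = n_features // 2
    W = rng.normal(size=(X.shape[1], half))  # random projection weights
    Z = gamma * (X @ W)
    return np.hstack([np.sin(Z), np.cos(Z)]) / np.sqrt(half)

def ridge_fit(S, y, penalty):
    """Ridge coefficients via the T x T dual form, cheap when P >> T."""
    T = S.shape[0]
    alpha = np.linalg.solve(S @ S.T + penalty * np.eye(T), y)
    return S.T @ alpha

# Simulated inputs (hypothetical, not market data).
rng = np.random.default_rng(42)
T, K, P = 12, 15, 12_000              # months, raw predictors, parameters
X = rng.normal(size=(T + 1, K))       # 12 training months plus 1 to forecast
y = rng.normal(scale=0.05, size=T)    # placeholder monthly returns

S = random_fourier_features(X, P)     # 13 x 12,000 feature matrix
beta = ridge_fit(S[:T], y, penalty=1e-3)
forecast = S[T] @ beta                # prediction for the held-out month
print(beta.shape, float(forecast))
```

Solving the 12-by-12 system rather than the 12,000-by-12,000 one gives the same ridge coefficients, which is what keeps so large a model cheap to fit on so little data. Whether its forecasts are any good out of sample is exactly what the debate is about.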

It is an academic debate, but the outcome will have sweeping consequences. Mr Kelly is also a portfolio manager at AQR, a quantitative hedge fund. The firm was once known for using more traditional—and parsimonious—methods than its peers. But it is now embracing the apparent virtues of complexity. Researchers, worried about overfitting data, have worried too little about underfitting it, reckons Mr Kelly.

Making better predictions with small data sets could be enormously profitable. Much financial research is strangled by small sample sizes and the difficulty of conducting experiments. Gathering more data often means waiting for it to accumulate, and in some areas observations are extremely sparse. When studying extreme events such as market blow-ups, bank runs and sovereign defaults, researchers often have just a few examples in modern history. Hedge funds in search of an edge spend billions of dollars on alternative data, from satellite images of Chinese rail traffic to investor sentiment scraped from social media.

Recently, the debate over complexity has reached fever pitch. Mr Kelly and his co-authors have faced a barrage of scepticism. Álvaro Cartea, Qi Jin and Yuantao Shi of the University of Oxford suggest the virtues of complexity may not hold if the data used is poorly collected, erroneous or otherwise noisy. Stefan Nagel of the University of Chicago suggests that for very small data sets, the supposedly complex models actually mimic a momentum-trading strategy, and that their success is a “lucky coincidence”. Messrs Kelly and Malamud have responded to their critics with another detailed paper.

It is too soon to prepare a eulogy for Ockham’s maxim. But even the sceptics do not outright reject the idea that big, complex models can produce better forecasts than simpler ones—they just think this might not be true at all times. Meanwhile, if the virtues of complexity are real, the changes to how many investors operate could be immense. Hiring the best machine-learning engineers will be more important than ever, and so will acquiring and cleaning data, if Mr Cartea and his co-authors are correct. The billion-dollar pay packets that tech firms offer superstar coders may begin to pop up at investment firms, too.

Investment firms will also see greater benefits from scale. The computational power required to train and run models is expensive, and so may become a “moat” protecting large hedge funds from competition. Larger players will be able to afford to experiment more, and across a wider range of asset classes. Smaller peers may struggle to keep up.

Reduced competition is not the only risk. Humans are still catching up when it comes to interpreting what the most advanced machine-learning models are doing. Investors may become increasingly reliant on black-box algorithms that are extremely difficult to interpret. Small models, by contrast, are not only easy to deploy but also easy for investors to think about and to tweak. Few will complain about the black boxes so long as they are making money. Yet if anything goes wrong with the new models (from mundane underperformance to entire investment strategies blowing up), their fans may find themselves wishing for a tool that could cut through the complexity.