Data-Driven, Data-Informed

A coffee conversation with my friend @ericjackson last month got me thinking about two fundamentally different approaches to investing. Let’s call them “data-driven” and “data-informed”.

In data-driven investing, as far as possible, you allow the data to make the decisions. Asymptotically, you don’t want any human involvement at all; it’s all information and algorithms. Quant, systematic and high-frequency trading all fall in this category, as do index funds.

In data-informed investing, the data is just one input into a complex, non-linear, non-generalizable process that ultimately relies on human judgement. Discretionary funds, whether long, short, activist, distressed, event-driven or other, fall into this category.

The challenge with being purely data-driven is: very often, you just don’t have the right data, or you have it but there’s not enough to be statistically significant, or the world has changed and so the data isn’t relevant any more. There’s a finite set of circumstances under which your data and analysis and actions are robust enough to be fully automatable, and that set keeps diminishing due to competition for returns. Also, data-driven investing isn’t as ideologically pure as many of its adherents would have you believe: there’s still substantial human input into what data and which algorithms you choose to use or even test.

The challenge with being data-informed is: what happens when the data tells you to do something your prior analysis doesn’t agree with? Do you listen to the data or do you overrule it? The former discounts your intuition and experience (and might put you out of a job). The latter is just confirmation bias in action. Almost nobody is disciplined enough to update their priors correctly. Narrative fallacy is everywhere.
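To make "updating your priors correctly" concrete, here is a minimal sketch of Bayes' rule applied to a single investment thesis. The function name and all of the probabilities are hypothetical, chosen purely for illustration. In practice the hard part is estimating those likelihoods honestly, not doing the arithmetic.

```python
def posterior(prior, p_data_if_true, p_data_if_false):
    """Bayes' rule: probability the thesis is right after seeing the data.

    prior           -- belief in the thesis before seeing the data (0..1)
    p_data_if_true  -- chance of observing this data if the thesis is right
    p_data_if_false -- chance of observing this data if the thesis is wrong
    All three inputs are illustrative assumptions, not estimates of anything real.
    """
    numerator = prior * p_data_if_true
    evidence = numerator + (1.0 - prior) * p_data_if_false
    if evidence == 0:
        raise ValueError("the data is impossible under both hypotheses")
    return numerator / evidence


# Hypothetical case: 80% confident in the thesis, then data arrives that is
# three times more likely if the thesis is wrong than if it is right.
updated = posterior(prior=0.8, p_data_if_true=0.2, p_data_if_false=0.6)
print(f"belief after the data: {updated:.2f}")
```

With these made-up numbers, the disciplined answer is neither to ignore the data (and stay at 80%) nor to abandon the thesis outright, but to move to a belief somewhere in between.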

Data-driven and data-informed approaches tend to play out at different scales and over different time horizons. Data-driven investing assumes that the world will largely stay the same; tomorrow will look more or less like today; patterns will persist. By construction, that approach won’t capture the truly big moments: the once-in-a-lifetime opportunities that careers are made from. In contrast, data-informed investors will tell you they are “forward-looking” and hence not prisoners to the past, but under efficient markets, there’s no reason to believe their forward-looking forecasts will actually beat the market.

As a practical matter, it’s easy to become (or say you are) data-informed: just look at some new data, and you’re done. But it’s not easy to know if you’re being effective. Is the data really helping you make better discretionary decisions? Conversely, becoming truly data-driven is not easy; it requires a lot of time and effort and upfront investment and expertise. But once you’re doing it, it’s easy to know if you’re effective; it’s a matter of simple statistics.
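As one reading of "simple statistics", the sketch below computes a t-statistic for the hypothesis that a strategy's mean return is zero, along with an annualised Sharpe ratio. It assumes a list of daily returns and 252 trading days a year; the function names and the sample numbers are hypothetical. It also ignores autocorrelation, transaction costs and the multiple-testing problem that arises when you try many strategies, so treat it as a starting point rather than a verdict.

```python
import math
import statistics


def t_statistic(returns):
    """t-statistic for the null hypothesis that the mean return is zero."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation (n - 1)
    if stdev == 0:
        raise ValueError("returns have zero variance")
    return mean / (stdev / math.sqrt(n))


def annualised_sharpe(returns, periods_per_year=252):
    """Mean over standard deviation, scaled to a yearly horizon.

    Assumes returns are already in excess of the risk-free rate and that
    252 periods per year is appropriate (daily data); both are assumptions.
    """
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)
    if stdev == 0:
        raise ValueError("returns have zero variance")
    return mean / stdev * math.sqrt(periods_per_year)


# Made-up daily returns, for illustration only.
sample = [0.01, -0.005, 0.02, 0.0, 0.015]
print(f"t-statistic: {t_statistic(sample):.2f}")
print(f"annualised Sharpe: {annualised_sharpe(sample):.2f}")
```

Five observations are far too few to conclude anything, which is itself the point: the same calculation that tells you whether you are effective also tells you when you do not yet have enough data to know.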

I’ve been on both sides of this fence at different times in my career, and I don’t believe that one or the other is superior in all states of the world. But I don’t think you can follow both approaches at the same time. That way lies incoherence and confusion. As the quantity of data in the world continues to explode, investors will have to explicitly choose which side of the divide they want to be on.

The one choice that will not be available is staying data-blind: remaining wilfully ignorant of information that is relevant to your decision-making. Fortunately, I have yet to meet a credible investor who believes that staying data-blind is a viable option!


[1] The distinction between data-driven and data-informed is, I think, orthogonal to other well-known ways to categorize investment styles: value versus growth, for example, or trend-following versus mean-reversion, or even active versus passive.

[2] I started my trading career as a quant, looking for arbitrage and purely data-driven trades, but with broad discretion on trade entry, exit and sizing. As time went by, I tried to make my discretion redundant, by automating those decisions as well.

At Quandl, we evolved in the opposite direction. Many of our earliest ‘alternative data’ products were for data-driven investors: the data held a signal, and investors could trade directly on that signal to capture alpha. But over time we added a lot of data that spoke to data-informed investors, with different horizons and priors and investing styles.