Why a +2891% return doesn't mean you should buy

A stock thirty times higher in a year sits at the top of my momentum list. Two simple statistical tricks keep it from breaking my whole system.

Right now my trading system is calculating momentum scores for 552 stocks. At the top: SNDK, also known as SanDisk. With a return over the past twelve months of +2,891 percent. Thirty times higher in a year.

My first reaction was understandable: this should go to the top of my buy list. My second reaction, after thinking about it, was different: this is exactly why I need two statistical tricks to keep my system from falling apart.

Let me explain.

The problem: one outlier can dominate an entire system

My trading system combines four things to assess a stock:

Value: how cheap is the company versus peers?
Quality: how healthy are the fundamentals?
Momentum: how is the price trending?
Catalyst: are there recent positive events?

Each stock gets a score on all four, and they're combined into one final score. The idea: only stocks that score well on multiple things at once make it to the top. Not stocks that are extreme on one thing.

But SanDisk at +2891% sits about 18 standard deviations above the average of all 552 stocks. For those unfamiliar with statistics: with an average height of about 1.75 meters and a standard deviation of about 7 centimeters, that's like someone being roughly three meters tall. Impossible in a normal world, but SanDisk's price did it.

The result: even if SanDisk's Quality, Value, and Catalyst are all mediocre, momentum alone would put it at number one. My whole philosophy of "buy balance, not extreme" would break because of one number.

That's a bug, not a feature.

Trick 1: z-scores

A z-score tells you how many standard deviations an observation deviates from the mean. The formula is simple:

z = (value - mean) / standard deviation

For my 552 momentum observations:

Mean return: 36 percent over twelve months
Standard deviation: 156 percent

SanDisk's z-score: (2891 - 36) / 156 = 18.3

A z-score of 18.3. That's beyond all proportion.
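
As a quick check, the calculation can be reproduced in a few lines of Python. This is a minimal sketch that uses only the mean and standard deviation quoted above; the function name z_score is made up for illustration and is not taken from the actual trading system.

```python
def z_score(value: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Figures quoted in the article, all in percent over twelve months.
sandisk_z = z_score(2891, mean=36, std_dev=156)
print(round(sandisk_z, 1))  # 18.3
```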

For reference: in a normal distribution, about 95% of observations have a z-score between -2 and +2, and about 99.7% sit between -3 and +3. A z-score of 18 has a probability of less than 1 in 10^70 under that distribution. In other words, an event you would never expect to see at all.

But SanDisk did it. Not because the world is broken, but because stock returns don't follow the normal distribution. Some stocks triple, some halve, and on rare occasions a few rise ten to thirty times. This is a known statistical phenomenon: fat tails, where extreme outcomes occur far more often than a normal distribution predicts.

My factor model is not designed for tails like that. Z-scores rescale values so the mean is 0 and the standard deviation is 1, which puts every factor on the same scale. But if one observation is beyond all proportion, its z-score is too, and it can still dominate the combined score.

That's where trick 2 comes in.

Trick 2: winsorization

Winsorization sounds fancy but it's simple: cap any values that fall outside a certain range.

In my system that range is [-3, +3]. Any score above +3 becomes 3. Any score below -3 becomes -3. Done.

For SanDisk that means: its z-score of 18.3 becomes simply 3.0. It's still a momentum extreme, but no longer dominant. Other factors get a chance again.
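
In code, a cap like this is a one-line clamp. A minimal sketch, assuming the symmetric limit of 3 described above; the name winsorize and the cap parameter are illustrative choices, not the system's actual code.

```python
def winsorize(z: float, cap: float = 3.0) -> float:
    """Clamp a z-score to the range [-cap, +cap]."""
    return max(-cap, min(cap, z))


print(winsorize(18.3))  # 3.0
print(winsorize(-4.2))  # -3.0
print(winsorize(0.4))   # 0.4
```

Scores already inside the range pass through unchanged, so only the extremes lose information.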

That seems arbitrary, and it is. Why +3 and not +5? Why a fixed cap and not a logarithmic scale? Those are design choices with trade-offs.

But the idea behind winsorization has a serious theoretical foundation. If you build a statistical measure that depends on the mean or standard deviation, and you let extreme outliers exist freely, those outliers throw your whole system into chaos. One SanDisk shifts the average, increases the standard deviation, and all other stocks are wrongly seen as "less extreme."
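
The distortion is easy to see with a toy example. The sketch below uses made-up returns (not real market data, and far fewer than 552 stocks) and compares the z-score of a strong but ordinary stock before and after a SanDisk-sized outlier joins the universe. The helper z_of is a hypothetical name.

```python
from statistics import mean, stdev

# Made-up returns in percent (NOT real market data): 19 ordinary stocks.
ordinary = [float(r) for r in range(-40, 150, 10)]
# The same universe plus one SanDisk-sized outlier.
with_outlier = ordinary + [2891.0]


def z_of(value, population):
    """Z-score of `value` against the sample mean and stdev of `population`."""
    return (value - mean(population)) / stdev(population)


strong_stock = 140.0  # best performer among the ordinary stocks
z_before = z_of(strong_stock, ordinary)
z_after = z_of(strong_stock, with_outlier)

print(f"stdev without outlier: {stdev(ordinary):.0f}")
print(f"stdev with outlier:    {stdev(with_outlier):.0f}")
print(f"z of +140% stock: {z_before:.2f} -> {z_after:.2f}")
```

In this toy universe the standard deviation grows more than tenfold, and the +140% stock drops from clearly above average to roughly average, purely because one other stock exploded.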

By z-scoring first and then winsorizing, you limit the influence of outliers while preserving the information in the signal. SanDisk still tops the momentum list, but now with a score the rest of the system can realistically digest.

What it actually does for me

My system saw this for SanDisk:

SanDisk: Momentum z-score 3.0 (capped), Value -1.2, Quality 0.4, Catalyst -0.8
Composite score: 0.43

Top of the momentum list, yes. But absolutely not at the top of the overall buy list. With a Value of -1.2 (overpriced) and a Catalyst of -0.8 (no recent positive events), the system is skeptical. A purchase only follows if multiple factors align positively.
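
How the four scores are weighted is not spelled out here, so the sketch below assumes a plain weighted average with equal weights by default. That is an assumption: equal weights give 0.35 for SanDisk rather than the 0.43 shown above, so the real weighting evidently differs. The function name composite and the dictionary keys are hypothetical.

```python
def composite(scores, weights=None):
    """Weighted average of factor scores; equal weights unless given (assumption)."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Factor scores quoted in the article; momentum is already capped at 3.0.
sandisk = {"momentum": 3.0, "value": -1.2, "quality": 0.4, "catalyst": -0.8}

capped = composite(sandisk)
uncapped = composite({**sandisk, "momentum": 18.3})
print(f"composite with capped momentum:   {capped:.2f}")
print(f"composite with uncapped momentum: {uncapped:.2f}")
```

Whatever the exact weights, the comparison shows the point: without the cap, momentum alone drags the composite far above anything the other three factors could produce.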

A stock that has risen thirtyfold could be in a bubble. Or it could be undergoing a fundamental transformation. My system doesn't know the difference. What it does do: refuse to blindly follow what momentum alone says.

For other builders

If you ever build a ranking system that combines factors, think about outliers. It can seem as if extremes should "naturally" rise to the top, and that this is exactly what you want. But one extreme in one variable can undermine your entire philosophy.

Two simple tricks solve that. First normalize values into z-scores, so they sit on the same scale. Then winsorize, so outliers are capped at a reasonable boundary. Two lines of code per factor, but the difference between a working system and a disaster.
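
Applied to a whole factor column, those two steps might look like the sketch below. It uses only the Python standard library and made-up returns (not real data); the name normalize_factor is hypothetical, and the code assumes at least two distinct values so the standard deviation is not zero.

```python
from statistics import mean, stdev


def normalize_factor(values, cap=3.0):
    """Z-score raw factor values, then clamp each score to [-cap, +cap]."""
    mu, sigma = mean(values), stdev(values)  # step 1: population statistics
    return [max(-cap, min(cap, (v - mu) / sigma)) for v in values]  # step 2: cap


# Made-up twelve-month returns in percent (NOT real data), one extreme outlier.
returns = [float(r) for r in range(-40, 150, 10)] + [2891.0]
momentum_scores = normalize_factor(returns)
print(max(momentum_scores))  # 3.0
```

The outlier still ranks first on momentum, but its score can no longer exceed the cap.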

My factor model has had these measures from day one. Not because I'm brilliant, but because I've read enough quantitative literature to know this is standard. What I didn't know: how strong the difference is. Only when I saw SanDisk in the data did it sink in.

+2891% sounds like a buy signal. For a human. For my system it's just one stock that's extreme on one factor, and that's not yet enough.
