Statistics 101: The More Things Change…


As someone who loves statistics, I go nuts when I see them abused, which, at the racetrack, is about as common as swearing in a Martin Scorsese film. Handicappers scream “track bias” after two races; people insist (despite zero proof) that the number one post position is a detriment in the Kentucky Derby; and, my favorite, there is the idea that “live” and/or “dead” money in the betting pools is something board-watchers can capitalize on.

First of all, while statistics do a great job of summarizing what was, they must be viewed in the context of how they were accumulated to truly understand their significance.

For example, in 2007, there were 20 .300 hitters in the American League who met the league minimum of 3.1 plate appearances per game, five fewer than the 25 in 1927.

Not a huge discrepancy, right?

Well, if we dig a little deeper, we find that, in 1927, AL batters recorded a total of 42,117 plate appearances, fewer than half of the 87,816 recorded by AL hitters in 2007. In other words, .300 hitters were more than twice as common relative to playing time. Add to this the “color line” (which Jackie Robinson famously broke in 1947), and it becomes abundantly clear that the talent pool was much thinner in 1927 than in 2007, which makes it hard to compare the top hitters of 1927 (Harry Heilmann at .398, Lou Gehrig at .373 and Bob Fothergill at .359) with the top batsmen of 2007 (Magglio Ordonez at .363, Ichiro Suzuki at .351 and Placido Polanco at .341).
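To put the two seasons on the same footing, the raw counts can be scaled by playing time. The snippet below is only a rough sketch using the four figures quoted above; the function name and the "per 10,000 plate appearances" yardstick are my own choices for illustration, not an official baseball statistic.

```python
# Figures quoted in the text: qualifying .300 hitters and total AL plate appearances.
seasons = {
    1927: {"hitters_300": 25, "plate_appearances": 42_117},
    2007: {"hitters_300": 20, "plate_appearances": 87_816},
}


def per_10k_pa(hitters_300, plate_appearances):
    """Qualifying .300 hitters per 10,000 league plate appearances (illustrative yardstick)."""
    return hitters_300 / plate_appearances * 10_000


rates = {year: per_10k_pa(**figures) for year, figures in seasons.items()}
ratio = rates[1927] / rates[2007]

for year, rate in rates.items():
    print(f"{year}: {rate:.1f} .300 hitters per 10,000 plate appearances")
print(f"The 1927 rate is {ratio:.1f} times the 2007 rate")
```

Scaled this way, the 1927 figure works out to roughly 5.9 against roughly 2.3 for 2007, so a gap that looked like "five fewer" in raw counts is really a difference of more than two to one.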

Likewise, due to a plethora of rule and scheme changes in the NFL, it is impossible to use the league’s official passer rating to assess the merits of old-time quarterbacks. Three-time Super Bowl MVP Joe Montana is widely considered one of the greatest signal-callers of all time, yet his career passer rating of 92.3 ranks 10th, below (are you ready for this?) Jimmy Garoppolo, among others.

From a horse racing perspective, this means that statistics and handicapping theories from yesteryear must be reevaluated in light of how the game has changed over the years. For instance, an article by William McGlothlin that appeared in the American Journal of Psychology demonstrated that the favorite in the last race of the day is likely to be underbet. Unfortunately, McGlothlin collected his data from 1947 to 1953, long before simulcasting, home computers, and online betting. His conclusion, which became a racetrack truism, simply doesn’t hold up today, as I found in a study of my own.

Similarly, the antiquated notion that one can profit from detecting “smart money” in the wagering pools is also fatally flawed. In a 1988 piece in the Journal of Economic Perspectives, William T. Ziemba and Richard H. Thaler referenced academic studies by Asch, Malkiel, and Quandt (1984, 1986) that investigated whether a late drop in odds reflected insider betting.

Asch, Malkiel, and Quandt discovered “that for winning horses the final odds tend to be lower than the ‘morning line odds’ (predicted odds by the track handicapper), whereas, for horses finishing out of the money, the final odds are much higher than the morning line odds.”

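But, alas, that was before the influx of commingled pools and computer-generated wagering, which have all but wiped out what was a small edge to begin with (and only in the place and show pools). Furthermore, assuming that pari-mutuel markets are efficient (and, for the most part, they are), the fact that morning-line underlays win more often than morning-line overlays is exactly what one would expect: horses bet below their morning line go off at shorter odds, and shorter-priced horses win more often. Winning more often is not the same as being profitable, though, as both groups still show a negative ROI: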

HORSES WITH FINAL ODDS GREATER THAN THEIR MORNING-LINE ODDS

Number: 70,953
Winners: 4,828
Rate: 6.8%
IV (impact value): 0.55
ROI: -29.8%

HORSES WITH FINAL ODDS LESS THAN THEIR MORNING-LINE ODDS

Number: 42,367
Winners: 9,601
Rate: 22.7%
IV: 1.73
ROI: -18.73%
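The win rates above can be checked directly from the counts, and the ROI figures hint at what the winners paid on average. The sketch below does both. It rests on an assumption the tables do not state: that ROI means the net return per $1 on flat win bets on every horse in the group. The function and variable names are mine, and the average odds it prints are back-of-the-envelope estimates, not figures from the study.

```python
# Counts and ROI figures copied from the two tables above.
groups = {
    "final odds > morning line": {"starters": 70_953, "winners": 4_828, "roi": -0.298},
    "final odds < morning line": {"starters": 42_367, "winners": 9_601, "roi": -0.1873},
}


def summarize(starters, winners, roi):
    win_rate = winners / starters
    # ASSUMPTION: ROI is the net return per $1 on flat win bets, so the total
    # paid back is starters * (1 + roi), shared among the winning tickets.
    avg_winning_odds = starters * (1 + roi) / winners - 1
    # Odds-to-1 a bettor would need, on average, to break even at this win rate.
    break_even_odds = 1 / win_rate - 1
    return {
        "win_rate": win_rate,
        "avg_winning_odds": avg_winning_odds,
        "break_even_odds": break_even_odds,
    }


results = {label: summarize(**figures) for label, figures in groups.items()}

for label, r in results.items():
    print(
        f"{label}: win rate {r['win_rate']:.1%}, "
        f"average winning odds {r['avg_winning_odds']:.1f}-1, "
        f"break-even odds {r['break_even_odds']:.1f}-1"
    )
```

Under that assumption, the horses that were bet down paid roughly 2.6-1 when they won but would have needed about 3.4-1 to break even, while the horses that drifted up paid roughly 9.3-1 against a break-even price near 13.7-1. Neither group paid enough to cover its win rate, which is the point: the odds board had already priced in the money movement.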

The bottom line here is not that I’m a genius and the betting crowd is made up of dunderheads. As much as I wish that were true, it’s not. The lesson here is that bettors should follow their own instincts and not rely solely on money movements and other simplistic angles that may or, more likely, may not prove to be meaningful.

Author: DDS