Tuesday, January 08, 2008

Bradman v Gretzky v Orr

Yesterday I read Charles Davis' book The Best of the Best. Overall this is an excellent statistical study of cricket and cricketers through Test history. But here I want to talk about one of the later chapters, in which he compares Don Bradman to greats from other sports.

The technique used to compare players across sports is to find a suitable quantity to measure for each player, so that the resulting distribution for all players becomes a bell curve, at least in the high tail. From this, you can compute each player's z-score (z = (x-µ)/σ, where µ is the mean, and σ is the standard deviation), which is directly comparable across different sports.

Davis' analysis of cricketers gives Bradman a z-score of 5.0 when considering batsmen only, and 4.4 when combining batting, bowling, and fielding. Keep in mind the batsmen-only score here, because that will be a fairer comparison to the ice hockey players. It's worth pointing out, for those unfamiliar with statistics, that a z-score of 5 is truly phenomenal — only one player in almost 3.5 million should be that good compared to all other players of the sport. That's 3.5 million Test cricketers, in this case, not 3.5 million members of the general public. There have only been about 2500 Test cricketers, so for Bradman to have existed makes us very lucky.
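For anyone who wants to check the "one player in N" figures, here is a minimal Python sketch. It assumes the quantity being measured really does follow a normal distribution in the high tail, and it counts only the upper tail (players at least this good). The function names are mine, not Davis'.

```python
import math

def upper_tail_probability(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def one_in(z):
    """Express the upper-tail probability as 'one player in N'."""
    return 1.0 / upper_tail_probability(z)

# The two Bradman z-scores quoted above.
for z in (4.4, 5.0):
    print(f"z = {z}: about 1 in {one_in(z):,.0f}")
```

Because the tail shrinks faster than exponentially, a difference of a few tenths in z changes the "one in N" figure by a large factor.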

Davis' analysis of other sports was not as detailed as for cricket, but the results are reasonably persuasive. Pele is the closest to Bradman, with a z-score of 3.7 for goals per international game. Ty Cobb's baseball batting average turns into a z-score of 3.6. Though these numbers might not look so far away from Bradman's 4.4 or 5, you have to remember that larger z-scores become much, much rarer: under the same normal-tail calculation, Pele's 3.7 makes him roughly a 1 in 9300 player.

Unfortunately, Davis neglected ice hockey, even though it is a major international sport. If cricket is to be counted as an international sport, then so should ice hockey. Most international cricket is sustained by relatively small population bases, and ice hockey's international reach is similar to cricket's. Wikipedia tells me that "most" of the World Championship medals have gone to Canada, the Czech Republic, Finland, Russia, Slovakia, Sweden, and the United States. That's seven countries, a similar number to cricket.

I am particularly interested in hockey here because it is the only major sport I know of to have a player who dominated statistically in a similar way to Bradman. Wayne Gretzky scored 3239 points (that is, goals and assists) in the NHL (including both regular season and play-offs). The next highest point scorers in NHL history are Mark Messier with 2182 and Gordie Howe with 2010. So I decided to do a similar analysis to Davis' for hockey. Bear in mind that this is a rough job done in a few hours and suitable for a blog post, rather than something a bit more careful suitable for ink and paper.

(A disclaimer: While I like watching hockey, I don't have a deep knowledge of the game. Feel free to correct anything I get wrong.)

Using career points, rather than points per game, is common for two main reasons. Firstly, the number of games played by the best players has stayed relatively constant in the last 60 years (certainly compared to cricket!), so comparisons between eras are meaningful. Gordie Howe, who played in the NHL from 1946 to 1971, has the record for the most NHL games with 1767. Second is Messier (1756), who retired in 2004.

Secondly, in a rough sport where players can play over 70 games a season, longevity is a key ingredient in greatness. Nevertheless, it wasn't Mario Lemieux's fault that he got cancer and had various other injuries, so I have considered points per game later as well.

So, onto the analysis. I downloaded the data on NHL players from The Internet Hockey Database. I deleted four players whose numbers didn't tally, and then used only the players classified as forwards by the Hockey Database. This left 3526 players.

I then binned the career points and eyeball-fitted a normal distribution to the high tail. I wasn't entirely sure what the best approach was here: I had two free parameters to work with (mean and standard deviation), and I didn't know the best way to do a least-squares fit in this situation. I'm not a statistician by training. Anyway, the fit at the high end is reasonable (it looks comparable to the fits in The Best of the Best), so we can at least get suggestive results. Here's the graph:
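One possible way to replace the eyeball fit with a least-squares one is a normal quantile (Q-Q) regression on the top slice of the data: sort the values, assign each of the top few a theoretical z from its rank, and fit the straight line x = µ + σ·z. This is not the method used for the graphs in this post, only a sketch of an alternative. The function name, the `tail_fraction` parameter, and the synthetic test data are all assumptions for illustration (it needs Python 3.8+ for `statistics.NormalDist`).

```python
import random
from statistics import NormalDist

def fit_normal_to_tail(values, tail_fraction=0.2):
    """Least-squares fit of x = mu + sigma * z over the top tail_fraction of values.

    Each of the k largest values gets the standard-normal quantile that
    matches its rank in the full sample; mu and sigma are then the
    intercept and slope of an ordinary least-squares line.
    """
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    k = max(2, int(n * tail_fraction))
    std_normal = NormalDist()
    # rank 1 is the largest value; (rank - 0.5) / n is its upper-tail fraction
    zs = [std_normal.inv_cdf(1 - (rank - 0.5) / n) for rank in range(1, k + 1)]
    xs = ordered[:k]
    z_mean = sum(zs) / k
    x_mean = sum(xs) / k
    s_zz = sum((z - z_mean) ** 2 for z in zs)
    s_zx = sum((z - z_mean) * (x - x_mean) for z, x in zip(zs, xs))
    sigma = s_zx / s_zz
    mu = x_mean - sigma * z_mean
    return mu, sigma

# Synthetic data only (not hockey data): a normal sample with known parameters.
rng = random.Random(0)
sample = [rng.gauss(100, 15) for _ in range(5000)]
mu, sigma = fit_normal_to_tail(sample, tail_fraction=0.2)
print(f"fitted mu = {mu:.1f}, sigma = {sigma:.1f}")
```

The choice of `tail_fraction` is exactly the "how much of the tail do you fit?" question, so it is worth trying a few values and seeing how much the resulting z-scores move.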



The black point on the far right is Gretzky. The fit parameters were µ = 0, σ = 700. Using these, Gretzky gets a z-score of 4.6, making him a 1 in 470 000 player. But that score is a little fuzzy, given the way I derived it.

The second-highest point scorer, Messier, gets a z-score of 3.1, comparable with the non-Bradman greats in other sports.

Now onto points per game. Here Gretzky is still the all-time leader, at 1.91 ppg, but he's only just ahead of Lemieux (1.85). Lemieux only played 1022 games to Gretzky's 1695. For the following graph, I deleted players with fewer than 10 games.
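The points-per-game step, including the minimum-games cut, can be sketched roughly as below. The data layout (a dict of name to points and games) is an assumption, not the format of the Hockey Database download, and the second player is made up purely to show the filter.

```python
def points_per_game(players, min_games=10):
    """Return {name: ppg} for players with at least min_games games."""
    return {
        name: points / games
        for name, (points, games) in players.items()
        if games >= min_games
    }

players = {
    "Gretzky": (3239, 1695),         # career points and games quoted in this post
    "Hypothetical call-up": (4, 3),  # made-up short career, only to show the filter
}

ppg = points_per_game(players)
print(ppg)
```

Without the cut, a player with a handful of games and a lucky scoring streak could land in the extreme tail and distort the fit.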



Once again it's been eyeball-fitted, this time with µ = 0.2 and σ = 0.35. This gives Gretzky a z-score of 4.9, Lemieux 4.7, and Gordie Howe 3.6.
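As a quick check of the arithmetic, the z-scores follow directly from the fit parameters and the totals quoted in this post. This is a small sketch; the helper name is mine.

```python
def z_score(x, mu, sigma):
    """Standard score: how many standard deviations x lies above mu."""
    return (x - mu) / sigma

# Career points fit (mu = 0, sigma = 700)
print("Gretzky, career points:", round(z_score(3239, 0, 700), 1))
print("Messier, career points:", round(z_score(2182, 0, 700), 1))

# Points-per-game fit (mu = 0.2, sigma = 0.35)
print("Gretzky, ppg:", round(z_score(1.91, 0.2, 0.35), 1))
print("Lemieux, ppg:", round(z_score(1.85, 0.2, 0.35), 1))
```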

Once again, the numbers are fuzzy, but they strongly suggest that Gretzky and Lemieux are up there with Bradman.

This is all well and good, but Mark, at least, would still say that Bobby Orr was better. I don't really know how to measure defencemen in terms of how well they defend, but I can see how many points they scored, and here Orr does fantastically in points per game. He averaged 1.38 ppg, easily the highest for any defenceman (second, and the only other defenceman above 1, is Paul Coffey at 1.08). Here's the ppg graph for the 1420 defencemen with at least 10 NHL games:



The fit parameters are µ = 0.12 and σ = 0.23. Again the numbers are fuzzy — how much of the tail do you fit? The z-score for Orr will be huge regardless. Here it is a whopping 5.5, though with a less generous fit for him it can be closer to 4.6.
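To get a feel for how sensitive Orr's score is to the fit, the sketch below recomputes his z-score and then asks what standard deviation would be needed to bring it down to 4.6. It assumes the mean stays at 0.12, which is my simplification: the parameters of the less generous fit are not given here, so the implied σ is only illustrative.

```python
def z_score(x, mu, sigma):
    """Standard score of x under a normal fit with the given mu and sigma."""
    return (x - mu) / sigma

def implied_sigma(x, mu, z):
    """Sigma that would give the value x a z-score of z, holding mu fixed."""
    return (x - mu) / z

orr_ppg = 1.38
print("z with sigma = 0.23:", round(z_score(orr_ppg, 0.12, 0.23), 1))
print("sigma implied by z = 4.6:", round(implied_sigma(orr_ppg, 0.12, 4.6), 3))
```

A fairly small change in σ is enough to move the z-score by almost a full unit, which is why the scores should be read as rough.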

There's certainly more room for analysis, particularly in terms of adjusting for eras. Some of Gretzky's early years were very high-scoring in general, for instance.

Whatever the case, I think these numbers suggest that Gretzky and Orr were about as great in what they did as Bradman was.

Comments:
They don't call Gretzky "the Great One" for no reason!

Have you considered doing the same approach for forwards only?

Also, the 80s (when Gretzky and Messier scored the lion's share of their points) are notorious for being the most prolific offensive era in NHL history. Roughly speaking, by decade the total goals per game look like:

2000s: 5.5
1990s: 6 (with a huge decline from the first part of the decade to the latter part of the decade)
1980s: 7.5
1970s: 6.5
1960s: 6
1950s: 5.5
1940s: 6 (6.5 if you count the war years)
1930s: 5

If you were able to "deflate" points by league average, you'd probably see Gretzky fall back to earth a bit.
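A crude version of that deflation could be sketched as follows, using the rough decade averages above. The baseline of 6.0 goals per game and the 200-point season are hypothetical values chosen only for illustration, and scaling points linearly with league scoring is itself an assumption.

```python
# Rough league-wide goals per game by decade, as listed above
# (the 1940s figure excludes the war years).
DECADE_GOALS_PER_GAME = {
    "1930s": 5.0, "1940s": 6.0, "1950s": 5.5, "1960s": 6.0,
    "1970s": 6.5, "1980s": 7.5, "1990s": 6.0, "2000s": 5.5,
}

def deflate_points(points, decade, baseline_gpg=6.0):
    """Scale points to what they would be worth at baseline_gpg league scoring."""
    return points * baseline_gpg / DECADE_GOALS_PER_GAME[decade]

# Hypothetical 200-point season in the 1980s, re-expressed at 6.0 goals per game.
print(deflate_points(200, "1980s"))
```

With season-by-season figures, the same function could be applied per season rather than per decade by swapping in a finer lookup table.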
 
Thanks MD. Now that someone who knows ice hockey has actually read this, I'll probably go back and adjust for eras (it might be pretty crude... I don't want to download too much hockey data). Thanks for giving me the rough figures, so I know what the adjustments basically are.

It'll be a couple of weeks before I get round to it though, I'm taking off for a holiday tomorrow!

The analyses I did were either for forwards only or for defencemen only.
 
Just found your site, have had a few ideas on a couple cricket sabermetric papers in the past, so I'm intrigued ;)

Anyhow, here it is, season by season, in case you want to be a bit more precise (the averages I gave you before were just eyeballed):

Regular Season GPG
2007-2008 5.61
2006-2007 5.89
2005-2006 6.17
2003-2004 5.14
2002-2003 5.31
2001-2002 5.24
2000-2001 5.51
1999-2000 5.49
1998-1999 5.27
1997-1998 5.28
1996-1997 5.83
1995-1996 6.29
1994-1995 5.97
1993-1994 6.48
1992-1993 7.25
1991-1992 6.96
1990-1991 6.91
1989-1990 7.37
1988-1989 7.48
1987-1988 7.43
1986-1987 7.34
1985-1986 7.94
1984-1985 7.77
1983-1984 7.89
1982-1983 7.73
1981-1982 8.03
1980-1981 7.69
1979-1980 7.03
1978-1979 7.00
1977-1978 6.59
1976-1977 6.64
1975-1976 6.82
1974-1975 6.85
1973-1974 6.39
1972-1973 6.55
1971-1972 6.13
1970-1971 6.24
1969-1970 5.81
1968-1969 5.96
1967-1968 5.58
1966-1967 5.96
1965-1966 6.08
1964-1965 5.75
1963-1964 5.55
1962-1963 5.95
1961-1962 6.02
1960-1961 6.00
1959-1960 5.90
1958-1959 5.80
1957-1958 5.60
1956-1957 5.38
1955-1956 5.07
1954-1955 5.04
1953-1954 4.80
1952-1953 4.79
1951-1952 5.19
1950-1951 5.42
1949-1950 5.47
1948-1949 5.43
1947-1948 5.86
1946-1947 6.32
1945-1946 6.69
1944-1945 7.35
1943-1944 8.17
1942-1943 7.22
1941-1942 6.23
1940-1941 5.36
1939-1940 4.99
1938-1939 5.07
1937-1938 5.06
1936-1937 4.93
1935-1936 4.33
 
But were the 80s averages due to the era, or to the fact that the players were so much better? Chicken and egg. It may be best to compare the top 10 scorers in the league each year and, for Bradman, only look at the top 100 batsmen of all time (not sure if they did this already). Or take the top 100 in hockey and the top 100 in cricket (point scorers for the NHL and runs scored in Test matches). I know cricket and hockey (from Australia, now SoCal; LA Kings fan since 2000), and my gut says Bradman and Gretzky should be pretty much even.
 