Saturday, March 01, 2014

Highest scores

Recently I got thinking about Flappy Bird scores, and noted that they should be similar to cricket scores in the way they're distributed.

Translating the problem to cricket, the question is, "What is the relationship between a batsman's highest score, his average, and the number of innings he's batted in?" A follow-up question is how well such a formula predicts, say, the average based on the innings and highest score. I expected there to be quite a lot of scatter (assessing a batsman on just one innings!), but the correlation ended up being pretty decent. This isn't a particularly useful correlation, but I found it fun to play with.

Given the average, the probability that a batsman makes a score less than x is roughly (1 - exp(-x/avg)). Treating each innings as independent of the others, the probability that all N of a batsman's innings are less than x is therefore (1 - exp(-x/avg))^N.
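As a rough sanity check on that independence argument, here is a minimal Python sketch. It assumes each innings is a draw from a continuous exponential distribution with mean avg, which is an idealisation (real scores are whole numbers, and not-outs are ignored). The function names are my own, purely for illustration.

```python
import math
import random


def prob_all_below(x, avg, n):
    """Closed form: probability that all n innings score less than x."""
    return (1.0 - math.exp(-x / avg)) ** n


def simulated_prob_all_below(x, avg, n, trials=10_000, seed=1):
    """Monte Carlo estimate of the same probability under the exponential model."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Highest of n independent exponential "innings" with mean avg.
        highest = max(rng.expovariate(1.0 / avg) for _ in range(n))
        if highest < x:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    x, avg, n = 200.0, 40.0, 100
    print(prob_all_below(x, avg, n))            # closed form
    print(simulated_prob_all_below(x, avg, n))  # simulation, should be close
```

For a batsman averaging 40 over 100 innings, the closed form puts the chance of never reaching 200 at about one half, and the simulated fraction should land close to it.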

A natural "expected" highest score, given the batsman's average, would be the value x such that the probability that all innings are less than x is 1/2. Calling this value HS, we have

(1 - exp(-HS/avg))^N = 1/2,

which is the basic relation between highest score, average, and number of innings batted.

Solving for HS (i.e., what highest score do we expect, given average and innings batted?):

HS = -avg * ln(1 - 0.5^(1/N)).

Solving for avg (given only highest score and innings batted, what do we think the batsman's average is?):

avg = -HS / ln(1 - 0.5^(1/N)).
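The two formulas translate directly into code. The following Python sketch uses function names I made up, and takes its inputs from Shane Warne's row in the table at the end of the post (199 innings, average 17.3, highest score 99).

```python
import math


def expected_highest_score(avg, innings):
    """HS = -avg * ln(1 - 0.5^(1/N)): median highest score under the model."""
    return -avg * math.log(1.0 - 0.5 ** (1.0 / innings))


def implied_average(highest_score, innings):
    """avg = -HS / ln(1 - 0.5^(1/N)): average implied by a highest score."""
    return -highest_score / math.log(1.0 - 0.5 ** (1.0 / innings))


if __name__ == "__main__":
    # Shane Warne: 199 innings, average 17.3, highest score 99.
    print(round(expected_highest_score(17.3, 199), 1))
    print(round(implied_average(99, 199), 1))
```

The results should come out close to the 98.1 and 17.5 listed for Warne in the table. Small differences are expected because the table's averages are rounded to one decimal place.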

Using the latter formula as a predictor works surprisingly (to me) well:

The weird qualification of 50 dismissals rather than 50 innings is because I began by including all batsmen with at least 2 dismissals, and at sample sizes that small, I figured counting dismissals would help avoid not-out irregularities. (The R-squared is actually a bit higher, about 0.78, in that expanded sample.)
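For anyone wanting to try a similar check, here is a small Python sketch. It assumes "R-squared" means the squared Pearson correlation between actual and predicted averages, which is my guess rather than something stated above. It also uses only five rows copied from the table below, so its output will not match the full-sample figure. All names are illustrative.

```python
import math

# (innings, highest score, actual average) for five rows of the table.
ROWS = [
    (80, 334, 99.9),   # DG Bradman
    (329, 248, 53.8),  # SR Tendulkar
    (232, 400, 52.9),  # BC Lara
    (199, 99, 17.3),   # SK Warne
    (86, 65, 12.6),    # D Gough
]


def implied_average(highest_score, innings):
    """avg = -HS / ln(1 - 0.5^(1/N))."""
    return -highest_score / math.log(1.0 - 0.5 ** (1.0 / innings))


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    actual = [avg for _, _, avg in ROWS]
    predicted = [implied_average(hs, n) for n, hs, _ in ROWS]
    print(r_squared(actual, predicted))
```

Feeding in every row of the table instead of these five would give a fairer comparison with the figure quoted above.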

I doubt I'll ever use this formula, but perhaps the most amusing individual result is Shane Warne: based on his average and innings batted, his "predicted" high score is... 98! So he didn't quite deserve that hundred after all.

Full table, straight off Statsguru – I was so lazy that I didn't even turn off the ICC game.

                                                               predicted from
                                                   actual      the other stat
Player                         Inns      NO      HS     avg      HS     avg
DG Bradman (Aus)                 80      10     334    99.9   475.0    70.3
H Sutcliffe (Eng)                84       9     194    60.7   291.6    40.4
KF Barrington (Eng)             131      15     256    58.7   307.7    48.8
ED Weekes (WI)                   81       5     207    58.6   279.3    43.4
WR Hammond (Eng)                140      16     336    58.5   310.4    63.3
KC Sangakkara (SL)              209      17     319    58.1   331.6    55.9
GS Sobers (WI)                  160      21     365    57.8   314.5    67.0
JB Hobbs (Eng)                  102       7     211    56.9   284.4    42.2
CL Walcott (WI)                  74       7     220    56.7   265.0    47.1
L Hutton (Eng)                  138      15     364    56.7   300.1    68.7
JH Kallis (ICC/SA)              280      40     224    55.4   332.4    37.3
GS Chappell (Aus)               151      19     247    53.9   290.1    45.9
AD Nourse (SA)                   62       7     231    53.8   242.1    51.3
SR Tendulkar (India)            329      33     248    53.8   331.5    40.2
BC Lara (ICC/WI)                232       6     400    52.9   307.5    68.8
Javed Miandad (Pak)             189      21     280    52.6   294.9    49.9
R Dravid (ICC/India)            286      32     270    52.3   315.1    44.8
Mohammad Yousuf (Pak)           156      12     223    52.3   283.3    41.2
AB de Villiers (SA)             152      16     278    52.3   281.9    51.6
S Chanderpaul (WI)              261      45     203    51.9   308.1    34.2
RT Ponting (Aus)                287      29     257    51.9   312.5    42.6
HM Amla (SA)                    130      11     311    51.6   270.0    59.4
A Flower (Zim)                  112      19     232    51.5   262.2    45.6
MEK Hussey (Aus)                137      16     195    51.5   272.5    36.9
Younis Khan (Pak)               158      14     313    51.4   279.1    57.6
SM Gavaskar (India)             214      16     236    51.1   293.1    41.2
SR Waugh (Aus)                  260      46     200    51.1   302.7    33.7
MJ Clarke (Aus)                 178      19     329    50.8   282.0    59.3
ML Hayden (Aus)                 184      14     380    50.7   283.2    68.1
AR Border (Aus)                 265      44     205    50.6   300.7    34.5
DPMD Jayawardene (SL)           240      15     374    50.3   294.2    63.9
IVA Richards (WI)               182      12     291    50.2   279.9    52.2
DCS Compton (Eng)               131      15     278    50.1   262.5    53.0
Inzamam-ul-Haq (ICC/Pak)        200      22     329    49.6   281.1    58.1
FMM Worrell (WI)                 87       9     261    49.5   239.3    54.0
V Sehwag (ICC/India)            180       6     319    49.3   274.4    57.4
B Mitchell (SA)                  80       9     189    48.9   232.3    39.8
TT Samaraweera (SL)             132      20     231    48.8   256.1    44.0
Misbah-ul-Haq (Pak)              80      14     161    48.8   231.7    33.9
GC Smith (ICC/SA)               203      13     277    48.7   276.8    48.8
RN Harvey (Aus)                 137      10     205    48.4   256.0    38.8
KD Walters (Aus)                125      14     250    48.3   250.8    48.1
SJ McCabe (Aus)                  62       5     232    48.2   216.9    51.6
ER Dexter (Eng)                 102       8     205    47.9   239.2    41.0
G Boycott (Eng)                 193      23     246    47.7   268.7    43.7
VS Hazare (India)                52       6     164    47.7   206.1    37.9
EH Hendren (Eng)                 83       9     205    47.6   228.1    42.8
AC Gilchrist (Aus)              137      20     204    47.6   251.8    38.6
SM Nurse (WI)                    54       1     258    47.6   207.6    59.1
RB Kanhai (WI)                  137       6     256    47.5   251.4    48.4
KP Pietersen (Eng)              181       8     227    47.3   263.2    40.8
WM Lawry (Aus)                  123      12     210    47.2   244.3    40.5
LRPL Taylor (NZ)                 98       9     217    46.9   232.6    43.8
RB Simpson (Aus)                111       7     311    46.8   237.8    61.2
PBH May (Eng)                   106       9     285    46.8   235.4    56.6
CH Lloyd (WI)                   175      14     242    46.7   258.2    43.7
AL Hassett (Aus)                 69       3     198    46.6   214.4    43.0
DM Jones (Aus)                   89      11     216    46.6   226.2    44.5
AN Cook (Eng)                   183      10     294    46.5   259.4    52.7
AR Morris (Aus)                  79       3     206    46.5   220.3    43.5
IJL Trott (Eng)                  87       6     226    46.5   224.7    46.7
DR Martyn (Aus)                 109      14     165    46.4   234.7    32.6
DL Amiss (Eng)                   88      10     262    46.3   224.5    54.0
AD Mathews (SL)                  62      12     157    46.2   207.7    34.9
M Leyland (Eng)                  65       5     187    46.1   209.4    41.1
WM Woodfull (Aus)                54       4     161    46.0   200.6    36.9
VVS Laxman (India)              225      34     281    46.0   265.9    48.6
EJ Barlow (SA)                   57       2     201    45.7   202.0    45.5
NC O'Neill (Aus)                 69       8     181    45.6   209.8    39.3
Saeed Anwar (Pak)                91       2     188    45.5   222.2    38.5
IR Bell (Eng)                   170      22     235    45.4   250.0    42.7
MD Crowe (NZ)                   131      11     299    45.4   237.9    57.0
JL Langer (Aus)                 182      12     250    45.3   252.3    44.9
G Kirsten (SA)                  176      15     275    45.3   250.7    49.6
CC Hunte (WI)                    78       6     260    45.1   213.0    55.0
SM Katich (Aus)                  99       6     157    45.0   223.6    31.6
M Azharuddin (India)            147       9     199    45.0   241.3    37.1
Zaheer Abbas (Pak)              124      11     274    44.8   232.4    52.8
MH Richardson (NZ)               65       3     145    44.8   203.5    31.9
CG Greenidge (WI)               185      16     226    44.7   249.9    40.4
GP Thorpe (Eng)                 179      28     200    44.7   248.1    36.0
GM Turner (NZ)                   73       6     259    44.6   208.1    55.6
AI Kallicharran (WI)            109      10     187    44.4   224.9    36.9
RB Richardson (WI)              146      12     194    44.4   237.6    36.2
TW Graveney (Eng)               123      13     258    44.4   230.0    49.8
Shoaib Mohammad (Pak)            68       7     203    44.3   203.6    44.2
AH Jones (NZ)                    74       8     186    44.3   207.0    39.8
DI Gower (Eng)                  204      18     215    44.3   251.6    37.8
DJ Cullinan (SA)                115      12     275    44.2   226.1    53.8
G Gambhir (India)                96       5     206    44.2   218.0    41.7
MC Cowdrey (Eng)                188      15     182    44.1   246.9    32.5
Hanif Mohammad (Pak)             97       8     337    44.0   217.5    68.2
ME Trescothick (Eng)            143      10     219    43.8   233.5    41.1
Saleem Malik (Pak)              154      22     237    43.7   236.2    43.8
RA Smith (Eng)                  112      15     175    43.7   222.2    34.4
EAB Rowan (SA)                   50       5     236    43.7   187.1    55.1
DC Boon (Aus)                   190      20     200    43.7   245.1    35.6
JH Edrich (Eng)                 127       9     310    43.5   227.0    59.5
MA Taylor (Aus)                 186      13     334    43.5   243.3    59.7
IR Redpath (Aus)                120      11     171    43.5   224.1    33.2
BF Butcher (WI)                  78       6     209    43.1   203.8    44.2
PA de Silva (SL)                159      11     267    43.0   233.7    49.1
DA Warner (Aus)                  54       3     180    42.9   187.0    41.3
HP Tillakaratne (SL)            131      25     204    42.9   224.8    38.9
MJ Slater (Aus)                 131       7     219    42.8   224.6    41.8
C Washbrook (Eng)                66       6     195    42.8   195.3    42.7
GA Gooch (Eng)                  215       6     333    42.6   244.4    58.0
M Amarnath (India)              113      10     138    42.5   216.6    27.1
RC Fredericks (WI)              109       7     169    42.5   215.0    33.4
IM Chappell (Aus)               136      10     196    42.4   224.1    37.1
JB Stollmeyer (WI)               56       5     160    42.3   186.2    36.4
DL Haynes (WI)                  202      25     184    42.3   240.1    32.4
PR Umrigar (India)               94       8     223    42.2   207.4    45.4
SC Ganguly (India)              188      17     239    42.2   236.4    42.6
DB Vengsarkar (India)           185      22     166    42.1   235.5    29.7
NS Sidhu (India)                 78       2     201    42.1   199.2    42.5
DJ McGlew (SA)                   64       6     255    42.1   190.6    56.3
CH Gayle (WI)                   174       9     333    42.0   232.2    60.2
HH Gibbs (SA)                   154       7     228    42.0   226.8    42.2
GR Viswanath (India)            155      10     222    41.9   226.9    41.0
ME Waugh (Aus)                  209      17     153    41.8   238.8    26.8
CG Macartney (Aus)               55       4     170    41.8   183.0    38.8
AG Prince (SA)                  104      16     162    41.6   208.8    32.3
MP Vaughan (Eng)                147       9     197    41.4   222.1    36.8
JC Adams (WI)                    90      17     208    41.3   200.9    42.7
GN Yallop (Aus)                  70       3     268    41.1   190.0    58.0
GRJ Matthews (Aus)               53       8     130    41.1   178.4    29.9
KC Wessels (Aus/SA)              71       3     179    41.0   190.0    38.6
TM Dilshan (SL)                 145      11     193    41.0   219.1    36.1
AJ Strauss (Eng)                178       6     177    40.9   227.1    31.9
PH Parfitt (Eng)                 52       6     131    40.9   176.9    30.3
MJ Prior (Eng)                  116      20     131    40.8   209.2    25.6
HW Taylor (SA)                   76       4     176    40.8   191.7    37.4
LEG Ames (Eng)                   72      12     149    40.6   188.5    32.1
PD Collingwood (Eng)            115      10     206    40.6   207.4    40.3
W Bardsley (Aus)                 66       5     193    40.5   184.6    42.3
AW Greig (Eng)                   93       4     148    40.4   198.2    30.2
Saeed Ahmed (Pak)                78       4     172    40.4   191.0    36.4
B Sutcliffe (NZ)                 76       8     230    40.1   188.5    48.9
ST Jayasuriya (SL)              188      14     340    40.1   224.6    60.7
BL D'Oliveira (Eng)              70       8     158    40.1   185.1    34.2
SP Fleming (NZ)                 189      10     274    40.1   224.7    48.8
RR Sarwan (WI)                  154       8     291    40.0   216.3    53.8
WJ Edrich (Eng)                  63       2     219    40.0   180.6    48.5
KWR Fletcher (Eng)               96      14     216    39.9   196.9    43.8
HA Gomes (WI)                    91      11     143    39.6   193.4    29.3
AJ Stewart (Eng)                235      21     190    39.5   230.4    32.6
BM McMillan (SA)                 62      12     113    39.4   177.1    25.1
CC McDonald (Aus)                83       4     170    39.3   188.3    35.5
DN Sardesai (India)              55       4     212    39.2   171.8    48.4
C Hill (Aus)                     89       2     191    39.2   190.5    39.3
Mushtaq Mohammad (Pak)          100       7     201    39.2   194.9    40.4
Azhar Ali (Pak)                  60       4     157    39.1   174.8    35.1
VL Manjrekar (India)             92      10     189    39.1   191.4    38.6
VT Trumper (Aus)                 89       8     214    39.0   189.7    44.0
MS Atapattu (SL)                156      15     249    39.0   211.4    46.0
AP Gurusinha (SL)                70       7     143    38.9   179.8    31.0
Majid Khan (Pak)                106       5     167    38.9   195.9    33.2
Asif Iqbal (Pak)                 99       7     175    38.9   192.9    35.2
MS Dhoni (India)                130      15     224    38.8   203.0    42.8
Taufeeq Umar (Pak)               81       5     236    38.7   184.5    49.5
WW Armstrong (Aus)               84      10     159    38.7   185.7    33.1
CD McMillan (NZ)                 91      10     142    38.5   187.7    29.1
PJP Burge (Aus)                  68       8     181    38.2   175.2    39.4
Mudassar Nazar (Pak)            116       8     231    38.1   195.1    45.1
BB McCullum (NZ)                145       8     302    38.1   203.6    56.5
Shakib Al Hasan (Ban)            65       5     144    38.0   172.6    31.7
JG Wright (NZ)                  148       7     185    37.8   202.9    34.5
Imran Khan (Pak)                126      25     136    37.7   196.2    26.1
MA Atherton (Eng)               212       7     185    37.7   215.8    32.3
Ijaz Ahmed (Pak)                 92       4     211    37.7   184.3    43.1
JV Coney (NZ)                    85      14     174    37.6   180.8    36.2
PE Richardson (Eng)              56       1     126    37.5   164.8    28.6
KR Stackpole (Aus)               80       5     207    37.4   177.9    43.6
KJ Hughes (Aus)                 124       6     213    37.4   194.1    41.0
ND McKenzie (SA)                 94       7     226    37.4   183.7    46.0
AN Petersen (SA)                 52       3     182    37.3   161.3    42.1
N Hussain (Eng)                 171      16     207    37.2   204.9    37.6
SV Manjrekar (India)             61       6     218    37.1   166.5    48.6
Mohsin Khan (Pak)                79       6     200    37.1   175.9    42.2
NJ Astle (NZ)                   137      10     222    37.0   195.8    42.0
KR Miller (Aus)                  87       7     147    37.0   178.8    30.4
Tamim Iqbal (Ban)                62       0     151    36.6   164.6    33.6
CL Hooper (WI)                  173      15     233    36.5   201.3    42.2
WJ Cronje (SA)                  111       9     135    36.4   184.9    26.6
KS Williamson (NZ)               56       2     135    36.4   160.0    30.7
SR Watson (Aus)                  95       3     176    36.3   178.9    35.7
JDP Oram (NZ)                    59      10     133    36.3   161.6    29.9
Wasim Raja (Pak)                 92      14     125    36.2   176.9    25.6
AJ Lamb (Eng)                   139      10     142    36.1   191.4    26.8
FE Woolley (Eng)                 98       7     154    36.1   178.7    31.1
Sadiq Mohammad (Pak)             74       2     166    35.8   167.4    35.5
AL Logie (WI)                    78       9     130    35.8   169.2    27.5
RJ Shastri (India)              121      14     206    35.8   184.9    39.9
A Ranatunga (SL)                155      12     135    35.7   193.2    24.9
JN Rhodes (SA)                   80       9     117    35.7   169.5    24.6
CG Borde (India)                 97      11     177    35.6   176.0    35.8
MW Gatting (Eng)                138      14     207    35.6   188.3    39.1
MN Samuels (WI)                  90       6     260    35.5   172.9    53.4
BJ Haddin (Aus)                  94       9     169    35.5   174.4    34.4
JA Rudolph (SA)                  83       9     222    35.4   169.7    46.4
Aamer Sohail (Pak)               83       3     205    35.3   169.0    42.8
GM Ritchie (Aus)                 53       5     146    35.2   152.9    33.6
MAK Pataudi (India)              83       3     203    34.9   167.2    42.4
JP Crawley (Eng)                 61       9     156    34.6   155.2    34.8
MA Butcher (Eng)                131       7     173    34.6   181.3    33.0
TL Goddard (SA)                  78       5     112    34.5   162.9    23.7
TW Hayward (Eng)                 60       2     137    34.5   153.9    30.7
W Jaffer (India)                 58       1     212    34.1   151.2    47.8
GS Blewett (Aus)                 79       4     214    34.0   161.3    45.1
Mohammad Hafeez (Pak)            70       6     196    34.0   156.9    42.4
WR Endean (SA)                   52       4     162    34.0   146.8    37.5
Yuvraj Singh (India)             62       6     169    33.9   152.6    37.6
AP Sheahan (Aus)                 53       6     127    33.9   147.3    29.2
AC MacLaren (Eng)                61       4     140    33.9   151.8    31.2
IT Botham (Eng)                 161       6     208    33.5   182.8    38.2
CL Cairns (NZ)                  104       5     158    33.5   168.1    31.5
KD Mackay (Aus)                  52       7      89    33.5   144.8    20.6
Yashpal Sharma (India)           59      11     140    33.5   148.8    31.5
Shoaib Malik (Pak)               54       6     148    33.5   145.9    33.9
AC Hudson (SA)                   63       3     163    33.5   151.0    36.1
DW Randall (Eng)                 79       5     174    33.4   158.2    36.7
JR Reid (NZ)                    108       5     142    33.3   168.1    28.1
GR Marsh (Aus)                   93       7     138    33.2   162.7    28.1
WW Hinds (WI)                    80       1     213    33.0   156.9    44.8
L Klusener (SA)                  69      11     174    32.9   151.3    37.8
APE Knott (Eng)                 149      15     135    32.8   176.0    25.1
M Prabhakar (India)              58       9     120    32.7   144.7    27.1
NT Paranavitana (SL)             60       5     111    32.6   145.5    24.9
P Roy (India)                    79       4     173    32.6   154.3    36.5
CJ Tavare (Eng)                  56       2     149    32.5   142.9    33.9
GP Howarth (NZ)                  83       5     147    32.4   155.4    30.7
Mushfiqur Rahim (Ban)            71       4     200    32.4   150.3    43.2
SL Campbell (WI)                 93       4     208    32.4   158.8    42.4
SM Pollock (SA)                 156      39     111    32.3   175.1    20.5
BE Congdon (NZ)                 114       7     176    32.2   164.5    34.5
JM Parks (Eng)                   68       7     108    32.2   147.7    23.5
MS Sinclair (NZ)                 56       5     214    32.1   141.0    48.7
Imran Farhat (Pak)               77       2     128    32.0   150.9    27.1
PJL Dujon (WI)                  115      11     139    31.9   163.4    27.2
Rameez Raja (Pak)                94       5     122    31.8   156.4    24.8
GM Wood (Aus)                   112       6     172    31.8   162.0    33.8
BA Young (NZ)                    68       4     267    31.8   145.9    58.2
A Flintoff (Eng/ICC)            130       9     167    31.8   166.4    31.9
RES Wyatt (Eng)                  64       6     149    31.7   143.6    32.9
MJK Smith (Eng)                  78       6     121    31.6   149.5    25.6
NJ Contractor (India)            52       1     108    31.6   136.6    25.0
CPS Chauhan (India)              68       2      97    31.6   144.9    21.1
MH Mankad (India)                72       5     231    31.5   146.3    49.7
DJ Bravo (WI)                    71       1     113    31.4   145.6    24.4
GA Hick (Eng)                   114       6     178    31.3   159.9    34.9
HAPW Jayawardene (SL)            77      11     154    31.2   147.2    32.7
MG Burgess (NZ)                  92       6     119    31.2   152.6    24.3
GT Dowling (NZ)                  77       3     239    31.2   146.9    50.7
FM Engineer (India)              87       3     121    31.1   150.3    25.0
AL Wadekar (India)               71       3     143    31.1   144.0    30.9
N Kapil Dev (India)             184      15     163    31.1   173.4    29.2
Habibul Bashar (Ban)             99       1     113    30.9   153.3    22.8
Kamran Akmal (Pak)               92       6     158    30.8   150.6    32.3
JT Tyldesley (Eng)               55       1     138    30.8   134.7    31.5
KLT Arthurton (WI)               50       5     157    30.7   131.6    36.6
ML Jaisimha (India)              71       4     129    30.7   142.2    27.8
MJ Greatbatch (NZ)               71       5     146    30.6   141.9    31.5
BA Edgar (NZ)                    68       4     161    30.6   140.4    35.1
Salman Butt (Pak)                62       0     122    30.5   137.0    27.1
JHB Waite (SA)                   86       7     134    30.4   146.9    27.8
T Taibu (Zim)                    54       3     153    30.3   132.2    35.1
MV Boucher (ICC/SA)             206      24     125    30.3   172.6    21.9
RA McLean (SA)                   73       3     142    30.3   141.2    30.5
MA Noble (Aus)                   73       7     133    30.3   141.0    28.5
BF Hastings (NZ)                 56       6     117    30.2   132.8    26.6
W Rhodes (Eng)                   98      21     179    30.2   149.6    36.1
HH Dippenaar (SA)                62       5     177    30.1   135.6    39.3
DL Vettori (ICC/NZ)             173      23     140    30.1   166.2    25.4
AD Gaekwad (India)               70       4     201    30.1   138.9    43.5
K Srikkanth (India)              72       3     123    29.9   138.9    26.5
AW Nourse (SA)                   83       8     111    29.8   142.6    23.2
TE Bailey (Eng)                  91      14     134    29.7   145.2    27.5
MJ Guptill (NZ)                  59       1     189    29.6   131.8    42.5
GW Flower (Zim)                 123       6     201    29.5   153.1    38.8
GJ Whittall (Zim)                82       7     203    29.4   140.6    42.5
Imtiaz Ahmed (Pak)               72       1     209    29.3   136.1    45.0
RS Mahanama (SL)                 89       1     225    29.3   142.2    46.3
Rashid Latif (Pak)               57       9     150    28.8   127.0    34.0
Abdul Razzaq (Pak)               77       9     134    28.6   134.9    28.4
J Darling (Aus)                  60       2     178    28.6   127.6    39.9
Moin Khan (Pak)                 104       8     137    28.6   143.2    27.3
KG Viljoen (SA)                  50       2     124    28.4   121.8    28.9
MJ Horne (NZ)                    65       2     157    28.4   129.0    34.5
RD Jacobs (WI)                  112      21     118    28.3   144.0    23.2
RP Arnold (SL)                   69       4     123    28.0   129.0    26.7
IA Healy (Aus)                  182      23     161    27.4   152.6    28.9
MR Ramprakash (Eng)              92       6     154    27.3   133.7    31.5
D Ramdin (WI)                    95      13     166    27.3   134.2    33.7
ADR Campbell (Zim)              109       4     103    27.2   137.7    20.4
Sir RJ Hadlee (NZ)              134      19     151    27.2   143.1    28.7
RC Russell (Eng)                 86      16     128    27.1   130.8    26.5
KR Rutherford (NZ)               99       8     107    27.1   134.5    21.6
SMH Kirmani (India)             124      22     102    27.0   140.3    19.7
SV Carlisle (Zim)                66       6     118    26.9   122.7    25.9
H Masakadza (Zim)                50       2     119    26.9   115.3    27.8
P Willey (Eng)                   50       6     102    26.9   115.3    23.8
J Dyson (Aus)                    58       7     127    26.6   118.1    28.6
PR Reiffel (Aus)                 50      14      79    26.5   113.7    18.4
RW Marsh (Aus)                  150      13     132    26.5   142.6    24.5
AC Parore (NZ)                  128      19     110    26.3   137.2    21.1
JJ Crowe (NZ)                    65       4     128    26.2   119.3    28.2
RS Kaluwitharana (SL)            78       4     132    26.1   123.5    27.9
G Miller (Eng)                   51       4      98    25.8   111.1    22.8
D Ganga (WI)                     86       2     135    25.7   124.0    28.0
KS More (India)                  64      14      73    25.7   116.4    16.1
RG Nadkarni (India)              67      12     122    25.7   117.6    26.7
IDS Smith (NZ)                   88      17     173    25.6   123.9    35.7
DA Allen (Eng)                   51      15      88    25.5   109.8    20.4
MW Tate (Eng)                    52       5     100    25.5   110.2    23.1
N Boje (SA)                      62      10      85    25.2   113.5    18.9
SA Durani (India)                50       2     104    25.0   107.3    24.3
DS Smith (WI)                    58       2     108    24.7   109.5    24.4
AK Davidson (Aus)                61       7      80    24.6   110.2    17.8
GS Ramchand (India)              53       5     109    24.6   106.8    25.1
JM Parker (NZ)                   63       2     121    24.6   110.8    26.8
SE Gregory (Aus)                100       7     201    24.5   122.0    40.4
C White (Eng)                    50       7     121    24.5   104.8    28.2
R Benaud (Aus)                   97       7     122    24.5   120.9    24.7
V Pollard (NZ)                   59       7     116    24.3   108.3    26.1
WPUJC Vaas (SL)                 162      35     100    24.3   132.7    18.3
DJ Richardson (SA)               64       8     109    24.3   109.9    24.1
SCJ Broad (Eng)                  95      12     169    24.2   119.2    34.3
SC Williams (WI)                 52       3     128    24.1   104.4    29.6
NR Mongia (India)                68       8     152    24.0   110.3    33.1
Mohammad Ashraful (Ban)         119       5     190    24.0   123.6    36.9
GO Jones (Eng)                   53       4     100    23.9   103.8    23.0
G Giffen (Aus)                   53       0     161    23.4   101.4    37.1
R Illingworth (Eng)              90      11     113    23.2   113.2    23.2
AC Bannerman (Aus)               50       2      94    23.1    98.9    21.9
CC Lewis (Eng)                   51       3     117    23.0    99.1    27.2
DL Murray (WI)                   96       9      91    22.9   113.0    18.4
JM Brearley (Eng)                66       3      91    22.9   104.4    19.9
DD Ebrahim (Zim)                 55       1      94    22.7    99.3    21.5
WAS Oldfield (Aus)               80      17      65    22.7   107.7    13.7
S Madan Lal (India)              62      16      74    22.7   101.9    16.4
Wasim Akram (Pak)               147      19     257    22.6   121.3    48.0
JE Emburey (Eng)                 96      20      75    22.5   111.2    15.2
MG Johnson (Aus)                 87      14     123    22.4   108.4    25.4
CB Wishart (Zim)                 50       1     114    22.4    96.0    26.6
HH Streak (Zim)                 107      18     127    22.4   112.7    25.2
FJ Titmus (Eng)                  76      11      84    22.3   104.8    17.9
Intikhab Alam (Pak)              77      10     138    22.3   105.0    29.3
GP Swann (Eng)                   76      14      85    22.1   103.9    18.1
Javed Omar (Ban)                 80       2     119    22.1   104.8    25.0
DJG Sammy (WI)                   63       2     106    21.7    97.9    23.5
KJ Wadsworth (NZ)                51       4      80    21.5    92.5    18.6
KD Ghavri (India)                57      14      86    21.2    93.7    19.5
RR Lindwall (Aus)                84      13     118    21.2   101.6    24.6
AF Giles (Eng)                   81      13      59    20.9    99.5    12.4
DN Patel (NZ)                    66       8      99    20.7    94.3    21.7
AFA Lilley (Eng)                 52       8      84    20.5    88.7    19.4
TG Evans (Eng)                  133      14     104    20.5   107.8    19.8
JG Bracewell (NZ)                60      11     110    20.4    91.2    24.6
BR Taylor (NZ)                   50       6     124    20.4    87.4    28.9
S Abid Ali (India)               53       3      81    20.4    88.4    18.6
B Lee (Aus)                      90      18      64    20.2    98.1    13.1
H Trumble (Aus)                  57      14      70    19.8    87.4    15.9
HDPK Dharmasena (SL)             51       7      62    19.7    84.9    14.4
B Yardley (Aus)                  54       4      74    19.6    85.3    17.0
Khaled Mashud (Ban)              84      10     103    19.0    91.4    21.5
TG Southee (NZ)                  50       5      77    19.0    81.4    18.0
MD Marshall (WI)                107      11      92    18.9    95.1    18.2
JN Gillespie (Aus)               93      28     201    18.7    91.8    41.0
Mohammad Rafique (Ban)           63       6     111    18.6    83.8    24.6
IWG Johnson (Aus)                66      12      77    18.5    84.4    16.9
Harbhajan Singh (India)         142      22     115    18.4    97.7    21.6
J Briggs (Eng)                   50       5     121    18.1    77.6    28.2
DG Cork (Eng)                    56       8      59    18.0    79.2    13.4
A Kumble (India)                173      32     110    17.8    98.1    19.9
Sarfraz Nawaz (Pak)              72      13      90    17.7    82.3    19.4
PH Edmonds (Eng)                 65      15      64    17.5    79.6    14.1
SK Warne (Aus)                  199      17      99    17.3    98.1    17.5
JJ Kelly (Aus)                   56      17      46    17.0    74.9    10.5
HJ Tayfield (SA)                 60       9      75    16.9    75.5    16.8
MG Hughes (Aus)                  70       8      72    16.6    76.9    15.6
Nasim-ul-Ghani (Pak)             50       5     101    16.6    71.1    23.6
BL Cairns (NZ)                   65       8      64    16.3    74.0    14.1
RW Taylor (Eng)                  83      12      97    16.3    78.0    20.3
GF Lawson (Aus)                  68      12      74    16.0    73.3    16.1
Wasim Bari (Pak)                112      26      85    15.9    80.8    16.7
WW Hall (WI)                     66      14      50    15.7    71.8    11.0
JM Blackham (Aus)                62      11      74    15.7    70.5    16.4
Abdul Qadir (Pak)                77      11      61    15.6    73.5    12.9
DR Pringle (Eng)                 50       4      63    15.1    64.7    14.7
ATW Grout (Aus)                  67       8      74    15.1    69.0    16.2
AME Roberts (WI)                 62      11      68    14.9    67.2    15.1
CM Old (Eng)                     66       9      65    14.8    67.6    14.2
PAJ DeFreitas (Eng)              68       5      88    14.8    68.0    19.2
SB Doull (NZ)                    50      11      46    14.6    62.6    10.7
Saqlain Mushtaq (Pak)            78      14     101    14.5    68.5    21.4
RO Collinge (NZ)                 50      13      68    14.4    61.7    15.9
DW Steyn (SA)                    89      21      76    14.3    69.2    15.6
PM Siddle (Aus)                  76      11      51    14.2    67.0    10.8
J Srinath (India)                92      21      76    14.2    69.5    15.5
VA Holder (WI)                   59      11      42    14.2    63.2     9.4
Fazal Mahmood (Pak)              50       6      60    14.1    60.4    14.0
JC Laker (Eng)                   63      15      63    14.1    63.6    14.0
CV Grimmett (Aus)                50      10      50    13.9    59.7    11.7
FS Trueman (Eng)                 85      14      39    13.8    66.5     8.1
MA Holding (WI)                  76      10      73    13.8    64.8    15.5
GAR Lock (Eng)                   63       9      89    13.7    62.0    19.7
DK Lillee (Aus)                  90      24      73    13.7    66.8    15.0
JA Snow (Eng)                    71      14      73    13.5    62.7    15.8
GR Dilley (Eng)                  58      19      56    13.4    59.2    12.6
Iqbal Qasim (Pak)                57      15      56    13.1    57.7    12.7
HMRKB Herath (SL)                70      15      80    13.0    59.9    17.3
Mashrafe Mortaza (Ban)           67       5      79    12.9    58.8    17.3
JR Thomson (Aus)                 73      20      49    12.8    59.7    10.5
AV Bedser (Eng)                  71      15      79    12.8    59.1    17.0
D Gough (Eng)                    86      18      65    12.6    60.6    13.5
J Garner (WI)                    68      14      60    12.4    57.1    13.1
CEL Ambrose (WI)                145      29      53    12.4    66.3     9.9
M Morkel (SA)                    65      10      40    12.3    55.8     8.8
GD McKenzie (Aus)                89      12      76    12.3    59.6    15.6
CJ McDermott (Aus)               90      13      42    12.2    59.4     8.6
IR Bishop (WI)                   63      11      48    12.2    54.9    10.6
Z Khan (India)                  127      24      75    12.0    62.3    14.4
SJ Harmison (Eng/ICC)            86      23      49    11.8    56.9    10.2
Mushtaq Ahmed (Pak)              72      16      59    11.7    54.4    12.7
S Venkataraghavan (India)        76      12      64    11.7    54.9    13.6
M Muralitharan (ICC/SL)         164      56      67    11.7    63.8    12.3
AA Mallett (Aus)                 50      13      43    11.6    49.8    10.0
Mohammad Sami (Pak)              56      14      49    11.6    51.0    11.1
DL Underwood (Eng)              116      35      45    11.6    59.2     8.8
RC Motz (NZ)                     56       3      60    11.5    50.8    13.6
RGD Willis (Eng)                128      55      28    11.5    60.0     5.4
EAS Prasanna (India)             84      20      37    11.5    55.1     7.7
JB Statham (Eng)                 87      28      38    11.4    55.3     7.9
AA Donald (SA)                   94      33      37    10.7    52.5     7.5
MS Kasprowicz (Aus)              54      12      25    10.6    46.2     5.7
AR Caddick (Eng)                 95      12      49    10.4    51.1    10.0
JM Anderson (Eng)               127      47      34    10.4    54.0     6.5
Waqar Younis (Pak)              120      21      45    10.2    52.6     8.7
Shahadat Hossain (Ban)           65      17      40    10.2    46.3     8.8
Shoaib Akhtar (Pak)              67      13      47    10.1    46.1    10.3
Umar Gul (Pak)                   67       9      65     9.9    45.5    14.2
M Ntini (SA)                    116      45      32     9.8    50.4     6.2
RM Hogg (Aus)                    58      13      52     9.8    43.2    11.7
GP Wickramasinghe (SL)           64       5      51     9.4    42.6    11.3
I Sharma (India)                 81      28      31     9.2    43.9     6.5
PR Adams (SA)                    55      15      35     9.0    39.4     8.0
BS Bedi (India)                 101      28      50     9.0    44.8    10.0
EJ Chatfield (NZ)                54      33      21     8.6    37.4     4.8
M Dillon (WI)                    68       3      43     8.4    38.7     9.4
DK Morrison (NZ)                 71      26      42     8.4    39.0     9.1
S Ramadhin (WI)                  58      14      44     8.2    36.3     9.9
CD Collymore (WI)                52      27      16     7.9    34.1     3.7
DBL Powell (WI)                  57       5      36     7.8    34.5     8.2
CA Walsh (WI)                   185      61      30     7.5    42.1     5.4
ARC Fraser (Eng)                 67      15      32     7.5    34.1     7.0
GD McGrath (Aus)                138      51      61     7.4    39.0    11.5
MJ Hoggard (Eng)                 92      27      38     7.3    35.6     7.8
Danish Kaneria (Pak)             84      33      29     7.1    33.9     6.0
LR Gibbs (WI)                   109      39      25     7.0    35.3     4.9
FH Edwards (WI)                  88      28      30     6.6    31.8     6.2
TM Alderman (Aus)                53      22      26     6.5    28.4     6.0
DE Malcolm (Eng)                 58      19      29     6.1    26.8     6.5
PCR Tufnell (Eng)                59      29      22     5.1    22.7     4.9
MS Panesar (Eng)                 68      23      26     4.9    22.4     5.7
AL Valentine (WI)                51      21      14     4.7    20.2     3.3
BS Chandrasekhar (India)         80      39      22     4.1    19.3     4.6
CS Martin (NZ)                  104      52      12     2.4    11.8     2.4

Comments:
As you say, it's kind of silly to predict an average based on a high score, but predicting a high score based on average could have some use to it.

Sorting the table by estimated high score minus actual high score we see the biggest "underachievers" are Bradman, Kallis, Chanderpaul, S Waugh, Sutcliffe, Border, M Waugh, Tendulkar, Amarnath, Prior.
The fact that Bradman is on top shows a flaw of the exponential model: there are other batsmen who get out, and time constraints, that make batting for long enough to score 475 difficult. Some of the other batsmen on the list have excuses too: they are all-rounders or bat at number 5 etc. It does amuse me that S Waugh is seen as not going on with it from this data, whereas at the time it was M Waugh that was perceived as having this problem (maybe that was before Mark scored his 150?).

At the other end, the biggest overachiever? Wasim Akram.

I guess it would be better to compare the actual distributions of scores to the theoretical one based on average, so you're not just using one datapoint to label a player as not-going-on-with-it, but I think this is interesting anyway.
 
I suppose I really should normalise the difference. Do you have any insight into what the correct thing to divide by is here so that we are finding the batsman who underachieves in high score for their skill level, not absolute underachievement?

If I calculate (estHS-HS)/HS the new list of underachievers is Willis, Collymore, Kasprowicz, Chatfield, Trueman, Giles, Oldfield, Kelly, Mackay, Prior. Also M Waugh is a bigger underachiever than S Waugh by this measure.
 
I'm not sure it measures quite what you're after, but an idea would be to just calculate a sort of p-value: What is the probability of having a highest score of at least the batsman's actual highest score? Which is 1 - (1 - exp(-HS/avg))^N.

Wasim Akram comes out on top at 0.0017, followed by Gillespie 0.0020. At the other end, Bob Willis has an absurdly high p-value of 0.999992, and Corey Collymore 0.9993. Probably they're hurt a bit by batting at number eleven and being left stranded so often.

(Everyone else is pretty sensible, though the numbers aren't a perfect fit: about 10% of the dataset are at p > 0.95).
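
For anyone who wants to reproduce these p-values, here is a minimal Python sketch under the same exponential assumption. The function name `p_high_score_at_least` is made up for this illustration, and the inputs are the rounded figures from the table, so the results should only roughly match the values quoted above.

```python
import math

def p_high_score_at_least(hs, avg, innings):
    """P(highest of `innings` independent exponential scores is >= hs)."""
    p_single_below = 1.0 - math.exp(-hs / avg)
    return 1.0 - p_single_below ** innings

# (innings, highest score, average), copied from the table rows above
players = {
    "JN Gillespie": (93, 201, 18.7),
    "RGD Willis": (128, 28, 11.5),
    "CD Collymore": (52, 16, 7.9),
}

for name, (n, hs, avg) in players.items():
    print(f"{name}: p = {p_high_score_at_least(hs, avg, n):.6f}")
```

A very small p means the actual highest score is surprisingly large for that average; a p close to 1 means it is surprisingly small.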
 
I think Bob Willis's absurdly high p-value is entirely because of being stranded!

If you look at his innings list on statsguru, I think he has a high score of 56 runs between dismissals in 73 completed innings. (For all-time records of this type see http://cricketarchive.com/Archive/Articles/1/1610.html although that's not very up to date.)

1-(1-exp(-56/11.5))^73 = 0.43, so his "high score" is about right.

I'm not willing to go through any more innings lists, so I think I'll leave this investigation here.
 
Nice to see you back, however fleetingly, David.

I did a practically identical exercise to this a few years ago, but never got around to writing it up. There was a companion piece about how you just need to know how many innings a bowler has bowled in and how many wickets he's taken overall to estimate, e.g., his 5WI count using the Poisson distribution. That never saw the light of day, either.

A few notes:

(1) A simple extension is to calculate the expected number of, e.g., 100s in a career of a given length and a given mean, which is

N * exp(-100/avg)

So, e.g., someone with Tendulkar's average and number of innings should have scored 51.25 100s (he actually got 51). I get r^2 = 0.946 for your 50+ dismissals dataset for this.

(1a) As an aside, you could, of course, try to predict the number of ducks you should expect using this method, but that draws attention to the inadequacy of the constant hazard assumption and consequent exponential approximation when it comes to the beginning of batsmen's innings (Tendulkar should have 6 ducks; he has 14; r^2 = 0.616 across the dataset).

(2) I'm not convinced that using the batsman's average as the mean of your distribution does you any favours when you then compare your model with reality. That's because the average, by accounting for not-out innings, effectively estimates a batsman's expectation of runs in a world in which not-outs don't exist. So you end up modelling that world and, when you compare it to the actual record, the not-outs muck things up ever so slightly. I can think of some quite involved ways of getting around this problem, but that would take away the fun of the really simple model providing an impressively good approximation of reality. So a quick fix would be to use RPI rather than avg as your mean. If you do that, you get a slightly improved fit (r^2 = 0.753), though you move from systematically slightly overestimating the HS to systematically slightly underestimating it.

(3) Pedantically, I think I take issue with the suggestion that batsmen who have a lower HS than would be expected from their average are underachievers; a narrower range of achievement for a given average suggests a slightly more consistent career and, since innings-to-innings consistency is weakly a good thing, I think I'd say the lower the HS the better.
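
Notes (1), (1a) and (2) can be sketched in Python as below. This is only an illustration under the exponential assumption: the function names are invented for the sketch, Tendulkar's figures (329 innings, average 53.78) are assumed career numbers that don't appear in the table rows shown here, and RPI is taken to be runs divided by all innings, recovered from the rounded average.

```python
import math

def expected_scores_at_least(threshold, mean, innings):
    """Expected count of innings reaching `threshold` (exponential model)."""
    return innings * math.exp(-threshold / mean)

def expected_ducks(mean, innings):
    """Expected count of innings ending below 1 run (same model)."""
    return innings * (1.0 - math.exp(-1.0 / mean))

def runs_per_innings(avg, innings, not_outs):
    """RPI = runs / innings, where runs = avg * (innings - not_outs)."""
    return avg * (innings - not_outs) / innings

def median_high_score(mean, innings):
    """The post's formula: HS = -mean * ln(1 - 0.5^(1/N))."""
    return -mean * math.log(1.0 - 0.5 ** (1.0 / innings))

# Assumed career figures for Tendulkar (not taken from the rows above)
T_INNINGS, T_AVG = 329, 53.78
print("expected 100s:", round(expected_scores_at_least(100, T_AVG, T_INNINGS), 2))
print("expected ducks:", round(expected_ducks(T_AVG, T_INNINGS), 2))

# SK Warne's row from the table: 199 innings, 17 not-outs, average 17.3
warne_rpi = runs_per_innings(17.3, 199, 17)
print("Warne HS from avg:", round(median_high_score(17.3, 199), 1))
print("Warne HS from RPI:", round(median_high_score(warne_rpi, 199), 1))
```

Because RPI is never larger than the average, swapping it in always pulls the predicted highest score down, which fits with the shift from overestimating to underestimating mentioned in note (2).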
 
Hi Gabe!

The under/over-achiever terminology was Martin's, not mine....

I did wonder about using RPI - I think it was you who first pointed out to me that it's a better predictor of the next innings than the average? But I'm too lazy to change my ways when all it does is squeeze a couple of R-squared points out of everything. (I feel that there has to be a better predictor than RPI, though again it wouldn't make large changes....)

On ducks and centuries: I've written on this topic before! e.g., here, here (this one has the next-simplest hazard function that I would use to incorporate ducks properly), here.
 
To give a very weak defense of myself, I put quotation marks around "underachiever" the first time I wrote it. I meant underachiever in the field of scoring large scores.

Without having thought about it much, I think I'd be pretty agnostic about whether I prefer consistent/inconsistent batsmen. Gabe, is there a reason that you would prefer to have a consistent batsman? Does that somehow lead to more won/drawn games?
 
bernard kachoyan left an excellent comment on the wrong post:

"I think a more elegant way to look at this is to consider extreme value theory. The maximum of a set of N exponential random variables can be shown to approach a Gumbel distribution for N large. The expected value of that distribution is similarly given, for N large, by (G + Ln(N))/A where A is the average of the distribution and G is Euler’s constant (about 0.58). This gives a relationship between the expected maximum and the average of the individual distribution which doesn’t rely on your arbitrary factor of ½. Note that your expression for the maximum is approximately (ln 2 + Ln(N))/A for large N. Since ln(2) = 0.69 this is close to the expression above so perhaps there is a deeper result here that I haven’t noticed.
All this obviously begs the question of whether there is a correlation between the max/mean ratio and the number of innings that these formulations suggest, but that is the subject of another post."

My response:

This is really interesting - I didn't know anything about limiting distributions of maximum values and had never heard of the Gumbel distribution.

(Clerical note: you should be multiplying by the mean of the exponential distribution, not dividing.)

What I am effectively doing with my (not actually arbitrary!) 1/2 is asking for the median of the almost-Gumbel distribution, which here is avg*(-ln(ln(2)) + ln(N)). I haven't proved that my original expression is approximately equal to this for large N, but it is clearly true when I empirically plot one against the other (differences of "predicted" high scores are less than half a run). I guess you lost a log somewhere in your algebra!

Anyway, the Gumbel distribution is skewed to the right, with the mean larger than the median, so your suggested method of using the mean of the Gumbel distribution results in higher "predicted" highest scores.

median: avg*(-ln(ln(2)) + ln(N))
mean: avg*(gamma + ln(N)).

I prefer using the median - I like having a number here that'll have around half of all batsmen below it and half above. (Actually only 45% of the batsmen in the dataset are above the predicted-by-Gumbel-median highest score; 38% are above the predicted-by-Gumbel-mean highest score.) But I can imagine people's tastes being different here.
 
Filling in some of the algebra that I skipped yesterday:

My original post has

HS = -avg * ln(1 - 0.5^(1/N)).

Now, for N large, 1/N is small, and we can approximate 0.5^(1/N) by its Taylor series truncated after the linear term. I.e.,

0.5^(1/N) = exp((1/N)*ln(1/2))
~ 1 + (1/N)ln(1/2)
= 1 - ln(2) / N.

So,

-avg*ln(1 - 0.5^(1/N)) ~ -avg*ln(ln(2)/N)
= avg*[ln(N) - ln(ln(2))],

which is the median of the Gumbel distribution.
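
A quick numerical check of this approximation, as a hedged Python sketch: the function names are made up for the illustration, and the three rows are taken from the table with their rounded averages.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def exact_median_hs(avg, n):
    """Original formula: HS = -avg * ln(1 - 0.5^(1/N))."""
    return -avg * math.log(1.0 - 0.5 ** (1.0 / n))

def gumbel_median_hs(avg, n):
    """Large-N approximation: avg * (ln(N) - ln(ln(2)))."""
    return avg * (math.log(n) - math.log(math.log(2.0)))

def gumbel_mean_hs(avg, n):
    """Mean of the limiting Gumbel distribution: avg * (gamma + ln(N))."""
    return avg * (EULER_GAMMA + math.log(n))

# (name, innings, average) from the table
rows = [("H Trumble", 57, 19.8), ("SK Warne", 199, 17.3), ("CS Martin", 104, 2.4)]

for name, n, avg in rows:
    print(f"{name}: exact {exact_median_hs(avg, n):.2f}, "
          f"Gumbel median {gumbel_median_hs(avg, n):.2f}, "
          f"Gumbel mean {gumbel_mean_hs(avg, n):.2f}")
```

The first two columns should agree to well within half a run for these rows, while the Gumbel mean comes out higher, since gamma is larger than -ln(ln(2)).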
 
That's what I get for posting too late at night.
 