Pappus' plane - cricket stats: Opening partnerships, and a Kiwi record

Sunday, January 13, 2008

Opening partnerships, and a Kiwi record

This entry is inspired by a line from The Best of the Best. On Hobbs and Sutcliffe, Charles Davis writes that, "[e]ach was a great batsman in his own right, but even that is not quite enough to account for their performances together".

Given the individual averages of two openers, how much would we expect their partnerships to average? And which opening pairs do the "most better" than you would expect?

To answer these questions, I took all opening pairs who opened the batting at least 15 times together. I ordered each pair so that the first had the lower average of the two (so that, in the tables and equations below, avg1 is the lower individual average, and avg2 is the higher). Common sense suggests that the average partnership should be more determined by the lower individual average, since that batsman is more likely to get out first.

Note that I've used individual averages as openers when doing this analysis.

I then threw the data into gretl, an econometrics program. Since there are two independent variables (one for each opening batsman), I can't easily make a pretty graph. You'll just have to cope with equations and tables. Here is some of the output:


Modèle 1: Estimation en MCO avec 97 observations 1-97
Variable dépendante: avg_part

      VARIABLE       COEFFICIENT        ERR. STD         T           p. critique
  const                -7,60135          5,30561      -1,433   0,15526
  avg1                  0,484575         0,144117      3,362   0,00112 ***
  avg2                  0,766951         0,157437      4,871  <0,00001 ***

  Moyenne de la variable dépendante = 41,6219
  Écart-type de la var. dép. = 12,6076
  Somme des carrés des résidus = 7632,06
  Erreur standard des résidus = 9,01067
  R2 non-ajusté = 0,499844

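You'll note that my computer is French, which is also why the decimal separator is a comma. MCO (moindres carrés ordinaires) is ordinary least squares, ERR. STD is the standard error, p. critique is the p-value, moyenne is 'mean', écart-type is 'standard deviation', somme des carrés des résidus is the sum of squared residuals, and R2 non-ajusté is the unadjusted R². The other words are close to their English counterparts. If you don't know what they mean, that is not important for what follows.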
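For anyone curious what the program is doing under the hood, here is a minimal sketch of ordinary least squares with two regressors, solved through the normal equations in plain Python. It is not gretl's implementation. The full 97-pair data set isn't reproduced in this post, so the toy rows are an assumption: they take a few (avg1, avg2) values from the tables further down and generate avg_part exactly from the coefficients reported above, simply so that the fit has a known answer to recover. The function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small square system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(rows):
    """rows: iterable of (avg1, avg2, avg_part). Returns (const, b1, b2)."""
    design = [(1.0, a1, a2) for a1, a2, _ in rows]
    y = [p for _, _, p in rows]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(3)] for i in range(3)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(3)]
    return tuple(solve_linear(xtx, xty))


# ASSUMPTION: toy data. The averages come from the tables in this post, but
# avg_part is generated from the reported coefficients, not from real scorecards.
pairs = [(23.00, 38.12), (22.08, 25.60), (31.71, 40.74), (41.94, 46.18), (56.37, 61.11)]
toy_rows = [(a1, a2, 0.484575 * a1 + 0.766951 * a2 - 7.60135) for a1, a2 in pairs]

const, b1, b2 = fit_ols(toy_rows)
print(round(const, 3), round(b1, 3), round(b2, 3))
```

With real data the fit would not be exact, and the residuals are what the standard errors and R² in the output summarise.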

The table tells us that, "on average", we expect that the average opening partnership (avg_part) should obey the following equation:

avg_part = 0,484575*avg1 + 0,766951*avg2 - 7,60135.
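As a small sketch, the equation can be wrapped in a function. The coefficients are copied from the output above with the decimal commas converted to points; the function name is my own, and the sort simply enforces the convention that avg1 is the lower of the two averages.

```python
# Coefficients copied from the regression output above.
CONST = -7.60135
COEF_LOWER = 0.484575   # multiplies avg1, the lower individual average
COEF_HIGHER = 0.766951  # multiplies avg2, the higher individual average


def expected_partnership(avg_a, avg_b):
    """Predicted average opening stand from two openers' averages-as-openers."""
    avg1, avg2 = sorted((avg_a, avg_b))  # avg1 is always the lower average
    return COEF_LOWER * avg1 + COEF_HIGHER * avg2 + CONST


# Hobbs and Sutcliffe averaged 56.37 and 61.11 as openers; argument order does not matter.
print(round(expected_partnership(61.11, 56.37), 2))
```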

The R² value says that roughly half of the variance in the data-set is explained by this model.
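The R² and the residual standard error can be cross-checked from the other summary numbers in the output. This sketch assumes the reported standard deviation uses the usual n - 1 denominator; under that assumption the results agree with the reported 0,499844 and 9,01067 to rounding.

```python
import math

n = 97            # observations
k = 3             # estimated parameters: const, avg1, avg2
sd_dep = 12.6076  # standard deviation of avg_part (assumed to use n - 1)
ssr = 7632.06     # sum of squared residuals

sst = (n - 1) * sd_dep ** 2          # total sum of squares about the mean
r_squared = 1 - ssr / sst            # share of variance explained
resid_se = math.sqrt(ssr / (n - k))  # standard error of the residuals

print(round(r_squared, 4), round(resid_se, 3))
```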

Obviously the equation isn't valid everywhere: if both openers average zero, you would not expect them to score negative runs! But 47 of the 97 opening pairs in the sample do better than the equation and 50 do worse, so it appears to be pretty much "in the middle".

It is surprising (to me, at least) that the coefficient of avg2 is so much higher than that of avg1. This says that it is the opener with the higher average who does more to determine the size of the average partnership. I'm at a bit of a loss to explain this. Perhaps openers with lower averages have lower strike rates (so while they don't score as many runs, they don't get out first)?

Now we get onto the pairs who do better than they should. In the following table, I've given the individual averages-as-openers, the runs scored together, the number of partnerships, 'obs' the observed average partnership, 'exp' the expected average partnership based on the equation above, and the ratio of the observed to expected.


opener1         opener2       avg1    avg2    runs  inns  obs     exp     ratio
T Franklin      J Wright      23,00   38,12   1543  28    55,11   32,78   1,68
Javed Omar      Nafees Iqbal  22,08   25,60   665   19    35,00   22,73   1,54
P Roy           V Mankad      31,71   40,74   868   16    57,87   39,01   1,48
J Stollmeyer    A Rae         41,94   46,18   1349  21    71,00   48,14   1,47
B Murray        G Dowling     23,92   31,55   786   20    39,30   28,19   1,39
C Cowdrey       G Pullar      42,42   43,84   906   15    64,71   46,58   1,39
C McDonald      A Morris      39,40   45,69   949   15    63,27   46,53   1,36
J Hobbs         H Sutcliffe   56,37   61,11   3249  38    87,81   66,58   1,32
Imran Farhat    Taufeeq Umar  33,10   39,30   754   15    50,27   38,58   1,30
Sadiq Mohammad  Majid Khan    34,93   42,23   1391  26    53,50   41,71   1,28
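A short sketch of how the 'obs' and 'ratio' columns fit together, for a few rows of the table. One caveat: runs divided by inns does not always reproduce 'obs'. The numbers do match if unbroken stands are left out of the denominator, which is the usual convention for partnership averages. The unbroken counts below are not in the table; they are inferred as the values that make the arithmetic agree, so treat them as an assumption.

```python
# (pair, runs, stands, unbroken stands, 'exp' from the table above)
# ASSUMPTION: the unbroken counts are inferred, not taken from the table.
rows = [
    ("Franklin/Wright", 1543, 28, 0, 32.78),
    ("Roy/Mankad", 868, 16, 1, 39.01),
    ("Stollmeyer/Rae", 1349, 21, 2, 48.14),
    ("Hobbs/Sutcliffe", 3249, 38, 1, 66.58),
]


def observed_average(runs, stands, unbroken):
    """Average partnership: runs per completed (broken) stand."""
    return runs / (stands - unbroken)


table = []
for pair, runs, stands, unbroken, exp in rows:
    obs = observed_average(runs, stands, unbroken)
    table.append((pair, round(obs, 2), round(obs / exp, 2)))

# Highest ratio of observed to expected first.
table.sort(key=lambda row: row[2], reverse=True)
for row in table:
    print(*row)
```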

And it's a Kiwi pair who finish first! I suppose that if you analyse enough Test data, you'll eventually find New Zealand coming first in something.

My guess that openers with lower averages score slower certainly applies to Trevor Franklin, who is the fourth-slowest batsman of all time according to Davis's list (qual. 1000 runs or 2000 balls faced; average over 20).

It may just be coincidence that some opening pairs do well in that table — perhaps they both had a good run of innings while batting together, or maybe they batted against weaker teams (I haven't tried adjusting for strength of bowling attack). But it may also be that they bring out the best in each other. Or, as Davis suggests in the case of Hobbs and Sutcliffe, that they held a psychological edge over their opponents when together.

When Stuart sees the other end of the table, he will be happy to see Graeme Wood coming dead last.


opener1         opener2       avg1    avg2    runs  inns  obs     exp     ratio
M Elliott       M Taylor      35,32   43,50   721   23    31,35   42,88   0,73
M Dekker        G Flower      15,86   29,30   357   22    16,23   22,56   0,72
B Woodfull      B Ponsford    50,90   54,18   860   22    40,95   58,62   0,70
G Gooch         T Robinson    43,88   44,97   621   19    32,68   48,15   0,68
R Simpson       L Hutton      25,92   56,48   477   15    31,80   48,28   0,66
E McMorris      C Hunte       26,86   45,07   548   21    26,10   39,98   0,65
B Pocock        B Young       22,93   32,13   378   21    18,00   28,15   0,64
Wasim Jaffer    V Sehwag      35,82   51,29   619   21    29,48   49,09   0,60
Hannan Sarkar   Javed Omar    20,66   22,08   207   18    11,50   19,34   0,59
A Hilditch      G Wood        31,56   33,61   354   18    19,67   33,47   0,59

It is also interesting that Javed Omar comes both second and second-last. Mark Dekker has easily the worst average of any opening batsman who's opened the innings at least 15 times.

For all the hugging, Langer and Hayden did slightly worse than would be expected, with a ratio of 0,92. Their average opening partnership of 52,08 is quite good (22nd on the list), but they each have excellent individual opening averages (48,94 and 52,66). The Langer/Hayden and Boycott/Amiss pairs are the only ones to have an average partnership of over 50 and a ratio below 1.

Figures are based on Tests 1 to 1858, that is up to the first Test between New Zealand and Bangladesh.

# posted by David Barry : 10:26

Comments:

How would Gavaskar-Chauhan partnership in tests have fared according to the calculations? As expected, better, or worse?

Just curious, and a poor one with stats at that. Thanks.

# posted by Soulberry : 14 Jan 2008, 9:01:00 pm

Hey Soulberry. Chauhan and Gavaskar did significantly better than you would expect, with a ratio of 1.14. Chauhan's average was less than 32, so to have an average opening stand with Gavaskar of over 50 was quite good.

If you rank the opening pairs by the ratio, they come 25th out of the 97.

Actually it could be a little bit higher, up around 22nd: for some reason my average partnership for them disagrees with that Cricinfo list (same number of runs, but I have their average as 52.81 whereas Cricinfo has 53.75). Perhaps there's a retired hurt affecting the numbers somewhere.

Either way, they did better than you would have expected.

# posted by David Barry : 14 Jan 2008, 9:20:00 pm

Thanks David.

Is there a way to apply a test of significance to the better (or worse) figure?

I'm a bit silly with stats and my last big involvement with statistical analysis (doing it all by myself) was when I wrote my thesis some 20+ years ago! I used to apply a test of significance...some P values and t and the like...(am I on the right track?)...to see if the observed difference is significant or not.

And thank you for your views at TCWJ. I truly appreciate it.

# posted by Soulberry : 17 Jan 2008, 9:30:00 am

That's a good question. Unfortunately I don't have any formal stats training (for the first time ever, I'm regretting not doing any stats courses in undergrad!), and this leaves me with some basic gaps in my knowledge of the subject.

I think to do a significance test I'd need to know the distribution of opening partnerships. It's not (I think) a simple case of saying that ratios above 1.2 are significant, because the cut-off value for, say, p = 0.05 will vary with the number of partnerships. (eg, a pair who bats once together and averages 200 will have a very high ratio, but this won't be statistically significant, because it's a one-off.)

I might be able to model this numerically and come up with some p-values, but off the top of my head there are some finicky details that I don't know how to solve....

Given how many times Chauhan and Gavaskar batted together, I'd guess that their ratio is statistically significant, but showing that seems to be a bit too hard for me.
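One rough way to put a number on this, as a sketch only and resting on strong assumptions: suppose each partnership is an independent draw from an exponential distribution whose mean is the expected average, treat that expected average as known exactly, and ignore unbroken stands. None of these assumptions has been checked against real scorecards, and the function name is hypothetical. Under them, the chance of averaging at least the observed figure over n stands has a closed form, because the sum of n exponentials follows an Erlang distribution whose upper tail is a Poisson lower tail.

```python
import math


def p_value_exponential(obs_avg, exp_avg, stands):
    """P(mean of `stands` exponential partnerships >= obs_avg) if the true mean is exp_avg.

    Uses the Erlang/Poisson identity:
    P(sum >= t) = P(Poisson(t / exp_avg) <= stands - 1), with t = stands * obs_avg.
    """
    x = stands * obs_avg / exp_avg
    term = math.exp(-x)  # Poisson probability of 0 events
    total = term
    for k in range(1, stands):
        term *= x / k    # Poisson probability of k events
        total += term
    return total


# Franklin and Wright: observed 55.11, expected 32.78, 28 stands (none unbroken).
print(p_value_exponential(55.11, 32.78, 28))
```

The cut-off does move with the number of stands, as argued above: the same ratio over more stands gives a smaller p-value. With 97 pairs in the sample, a few small p-values would also turn up by chance alone, so this is a guide rather than a proper test.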

# posted by David Barry : 17 Jan 2008, 8:14:00 pm

Very interesting. Where do Greenidge-Haynes, and Saeed Anwar - Aamir Sohail come in? How about Mark Taylor and Slater? They had some huge partnerships. So did Marsh and Boon, for that matter.

I wonder if Chauhan really could have been slower than Gavaskar :-)

# posted by Optimistix : 19 Jan 2008, 6:56:00 pm

Greenidge-Haynes: 0.98, pretty much bang in the middle, 49th.

Anwar-Sohail: 0.94, 59th.

Taylor-Slater: 1.10, 28th.

Marsh-Boon: 1.08, 33rd.

# posted by David Barry : 19 Jan 2008, 7:29:00 pm

To be picky - while you have used averages as opener, did you exclude the partnership in question itself?

In case I worded that confusingly, what I meant is :

To predict the average partnership of, say, Langer and Hayden, based on their individual averages as openers when _not_ opening together.

# posted by Optimistix : 20 Jan 2008, 12:43:00 pm

No, I didn't exclude innings when they batted together when calculating individual averages. That's not really relevant to what I want to look at here, which is the average size of the partnership compared to the individual averages.

Now that I think about it, it might be relevant to consider the individual averages ONLY when batting with their particular partner. I'm happy with what I've done though, since the overall opening average gives a better reflection on how good the opener was.

# posted by David Barry : 20 Jan 2008, 1:00:00 pm

hmm... this is totally mind-boggling crazy... math was a nightmare for me :)... I can't believe that Hobbs & Sutcliffe didn't end up on the top... very interesting. Are you a mathematician by profession or undertake anything related to the cricket world statistics?

# posted by Anonymous : 23 Jan 2008, 4:57:00 pm

Hi Scorpi. Hobbs and Sutcliffe have easily the best average opening partnership of any regular opening pair (87,8), but because one averaged over 55 and the other over 60, you'd expect them to have a very high average opening partnership! Franklin and Wright overall weren't as good - average partnership 55 - but Franklin only averaged 23, so their average partnership is really quite remarkable.

I did my undergrad in maths and physics. Cricket statistics is just a hobby for me.

# posted by David Barry : 23 Jan 2008, 5:48:00 pm

David,

I don't know if you have the archives at hand in the correct format for this, but I'm very interested in trying to derive a statistic called "the-averaged-fall-of-wicket-quotient" for openers. Here's how it works. One of the two opening batsmen is obviously the first wicket to fall in any innings. His partner carries on till he too, is dismissed (in the limiting case, he carries his bat). Now, assign a number to opening batsmen corresponding to the fall of wickets. If you are the first to be dismissed, you get a 1. If you are the second, you get a 2, and so on. Add these up over all innings and divide by the total number of innings. So if a batsman is dismissed in four innings as being the first, second, first and third wickets to fall, his quotient is 1.5. It gives some indication of how deep the opening batsman lands up batting into each innings. I'm not quite sure of how to handle not-outs and instances of carrying the bat. The latter must be assigned some suitably high value, and the former needs to note how many other wickets fell in the same innings (and perhaps the result).

Just an idea.
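A minimal sketch of the quotient as described, covering dismissed innings only. Not-outs and carrying the bat are left out because their handling is still open; the function name and input format are made up for illustration.

```python
def fow_quotient(dismissal_positions):
    """Mean fall-of-wicket position over an opener's dismissed innings.

    dismissal_positions: one integer per innings, 1 if the opener was the
    first wicket to fall, 2 if the second, and so on. Not-out innings are
    simply omitted here, which is an assumption, not a settled rule.
    """
    if not dismissal_positions:
        raise ValueError("need at least one dismissed innings")
    return sum(dismissal_positions) / len(dismissal_positions)


# First, second, first and third wicket to fall: (1 + 2 + 1 + 3) / 4
print(fow_quotient([1, 2, 1, 3]))
```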

# posted by Samir Chopra : 24 Jan 2008, 8:20:00 pm

Sorry, that should be 1.75 above.

# posted by Samir Chopra : 24 Jan 2008, 8:21:00 pm

That's an interesting idea, Samir. It shouldn't be too hard to calculate, so I'll write a post on it with the results either later tonight or tomorrow.

# posted by David Barry : 24 Jan 2008, 8:34:00 pm
