Sunday, January 13, 2008

Opening partnerships, and a Kiwi record

This entry is inspired by a line from The Best of the Best. On Hobbs and Sutcliffe, Charles Davis writes that "[e]ach was a great batsman in his own right, but even that is not quite enough to account for their performances together".

Given the individual averages of two openers, how much would we expect their partnerships to average? And which opening pairs do the "most better" than you would expect?

To answer these questions, I took all opening pairs who opened the batting at least 15 times together. I ordered each pair so that the first had the lower average of the two (so that, in the tables and equations below, avg1 is the lower individual average, and avg2 is the higher). Common sense suggests that the average partnership should be more determined by the lower individual average, since that batsman is more likely to get out first.

Note that I've used individual averages as openers when doing this analysis.

I then threw the data into gretl, an econometrics program. Since there are two independent variables (one for each opening batsman), I can't easily make a pretty graph. You'll just have to cope with equations and tables. Here is some of the output:

Model 1: OLS estimates using the 97 observations 1-97
Dependent variable: avg_part

Variable  Coefficient  Std. error  t-statistic  p-value
const     -7,60135     5,30561     -1,433       0,15526
avg1       0,484575    0,144117     3,362       0,00112   ***
avg2       0,766951    0,157437     4,871       <0,00001  ***

Mean of dependent variable = 41,6219
Standard deviation of dep. var. = 12,6076
Sum of squared residuals = 7632,06
Standard error of residuals = 9,01067
Unadjusted R2 = 0,499844

You'll note that my computer is French: gretl printed its labels in French (translated here) and uses commas as decimal separators. If you don't know what the terms mean, that is not important.
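For readers without gretl, here is a minimal sketch of the same kind of fit in plain Python. It is not the script behind the numbers above: the function name `ols_fit` is made up for illustration, and because the full 97-pair data set is not reproduced in this post, the check data below are hypothetical rows generated from the published coefficients. Decimal points replace the decimal commas.

```python
def ols_fit(rows):
    """Fit avg_part = const + b1*avg1 + b2*avg2 by ordinary least squares.

    rows is a list of (avg1, avg2, avg_part) tuples. The normal equations
    (X'X) b = X'y are solved by Gaussian elimination with partial pivoting.
    """
    # Design matrix with a leading 1 for the constant term.
    X = [(1.0, a1, a2) for a1, a2, _ in rows]
    y = [part for _, _, part in rows]
    k = 3
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Forward elimination.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (A[i][k] - tail) / A[i][i]
    return b


# Coefficients as published in the output above.
PUBLISHED = (-7.60135, 0.484575, 0.766951)

# Hypothetical check data: avg_part is generated exactly from the published
# equation, so a correct fit should hand the same coefficients back.
pairs = [(23.00, 38.12), (22.08, 25.60), (31.71, 40.74),
         (41.94, 46.18), (56.37, 61.11)]
rows = [(a1, a2, PUBLISHED[0] + PUBLISHED[1] * a1 + PUBLISHED[2] * a2)
        for a1, a2 in pairs]

const, b1, b2 = ols_fit(rows)
print(f"const={const:.5f}, avg1={b1:.6f}, avg2={b2:.6f}")
```

With real data each row would be one opening pair's two individual averages and their observed average partnership. The fit would then no longer be exact, which is what the residual figures in the output measure.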

The table tells us that, "on average", we expect that the average opening partnership (avg_part) should obey the following equation:

avg_part = 0,484575*avg1 + 0,766951*avg2 - 7,60135.

The R2 value says that roughly half of the variance in the data-set is explained by this model.

Obviously the equation isn't valid everywhere: if both openers average zero, you would not expect them to score negative runs! But 47 of the 97 opening pairs in the sample do better than the equation predicts, and 50 do worse, so it appears to be pretty much "in the middle".
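As a quick illustration, the equation can be wrapped in a small function. This is only a sketch: the name `expected_partnership` and the two example openers (averaging 40 and 50) are hypothetical, and decimal points replace the decimal commas used in the post.

```python
# Coefficients copied from the regression output above.
CONST, COEF_AVG1, COEF_AVG2 = -7.60135, 0.484575, 0.766951


def expected_partnership(avg_a, avg_b):
    """Expected average opening stand given two openers' averages.

    The averages may be passed in either order; the lower one is treated
    as avg1 and the higher as avg2, as in the post.
    """
    avg1, avg2 = sorted((avg_a, avg_b))
    return CONST + COEF_AVG1 * avg1 + COEF_AVG2 * avg2


# Two hypothetical openers averaging 40 and 50.
print(round(expected_partnership(40, 50), 2))  # 50.13

# Two openers averaging zero: the equation returns just the negative constant.
print(round(expected_partnership(0, 0), 2))    # -7.6
```

The second call shows why the equation should only be trusted for averages in the range covered by the sample.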

It is surprising (to me, at least) that the coefficient of avg2 is so much higher than that of avg1. This says that it is the opener with the higher average who does more to determine the size of the average partnership. I'm at a bit of a loss to explain this. Perhaps openers with lower averages have lower strike rates (so while they don't score as many runs, they don't get out first)?

Now we get onto the pairs who do better than they should. In the following table, I've given the individual averages-as-openers, the runs scored together, the number of partnerships ('inns'), the observed average partnership ('obs'), the expected average partnership based on the equation above ('exp'), and the ratio of the observed to the expected.

opener1         opener2       avg1   avg2   runs  inns  obs    exp    ratio
T Franklin      J Wright      23,00  38,12  1543  28    55,11  32,78  1,68
Javed Omar      Nafees Iqbal  22,08  25,60  665   19    35,00  22,73  1,54
P Roy           V Mankad      31,71  40,74  868   16    57,87  39,01  1,48
J Stollmeyer    A Rae         41,94  46,18  1349  21    71,00  48,14  1,47
B Murray        G Dowling     23,92  31,55  786   20    39,30  28,19  1,39
C Cowdrey       G Pullar      42,42  43,84  906   15    64,71  46,58  1,39
C McDonald      A Morris      39,40  45,69  949   15    63,27  46,53  1,36
J Hobbs         H Sutcliffe   56,37  61,11  3249  38    87,81  66,58  1,32
Imran Farhat    Taufeeq Umar  33,10  39,30  754   15    50,27  38,58  1,30
Sadiq Mohammad  Majid Khan    34,93  42,23  1391  26    53,50  41,71  1,28
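The 'exp' and 'ratio' columns can be reproduced from the regression coefficients, as in the sketch below for three rows of the table. The helper name `ratio_row` is made up for illustration. The 'obs' values are taken straight from the table rather than recomputed as runs divided by inns, because the two do not always agree: for Hobbs and Sutcliffe, 3249/38 is 85,5 rather than 87,81. One guess is that unbroken stands are left out of the divisor, but the post does not say so.

```python
# Regression coefficients from the output earlier in the post.
CONST, COEF_AVG1, COEF_AVG2 = -7.60135, 0.484575, 0.766951

# (opener1, opener2, avg1, avg2, obs) for three rows of the table above,
# with decimal commas converted to decimal points.
rows = [
    ("T Franklin", "J Wright", 23.00, 38.12, 55.11),
    ("Javed Omar", "Nafees Iqbal", 22.08, 25.60, 35.00),
    ("J Hobbs", "H Sutcliffe", 56.37, 61.11, 87.81),
]


def ratio_row(avg1, avg2, obs):
    """Return (exp, ratio), each rounded to two decimals as in the table."""
    exp = CONST + COEF_AVG1 * avg1 + COEF_AVG2 * avg2
    return round(exp, 2), round(obs / exp, 2)


results = {(o1, o2): ratio_row(a1, a2, obs) for o1, o2, a1, a2, obs in rows}

# Print the pairs from highest ratio to lowest.
for (o1, o2), (exp, ratio) in sorted(results.items(), key=lambda kv: -kv[1][1]):
    print(f"{o1} / {o2}: exp={exp:.2f}, ratio={ratio:.2f}")
```

For these three rows the computed values match the table: 32,78 and 1,68 for Franklin and Wright, 22,73 and 1,54 for Javed Omar and Nafees Iqbal, and 66,58 and 1,32 for Hobbs and Sutcliffe.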

And it's a Kiwi pair who finish first! I suppose that if you analyse enough Test data, you'll eventually find New Zealand coming first in something.

My guess that openers with lower averages score slower certainly applies to Trevor Franklin, who is the fourth-slowest batsman of all-time according to Davis's list (qual. 1000 runs or 2000 balls faced; average over 20).

It may just be coincidence that some opening pairs do well in that table — perhaps they both had a good run of innings while batting together, or maybe they batted against weaker teams (I haven't tried adjusting for strength of bowling attack). But it may also be that they bring out the best in each other. Or, as Davis suggests in the case of Hobbs and Sutcliffe, that they held a psychological edge over their opponents when together.

When Stuart sees the other end of the table, he will be happy to see Graeme Wood coming dead last.

opener1         opener2       avg1   avg2   runs  inns  obs    exp    ratio
M Elliott       M Taylor      35,32  43,50  721   23    31,35  42,88  0,73
M Dekker        G Flower      15,86  29,30  357   22    16,23  22,56  0,72
B Woodfull      B Ponsford    50,90  54,18  860   22    40,95  58,62  0,70
G Gooch         T Robinson    43,88  44,97  621   19    32,68  48,15  0,68
R Simpson       L Hutton      25,92  56,48  477   15    31,80  48,28  0,66
E McMorris      C Hunte       26,86  45,07  548   21    26,10  39,98  0,65
B Pocock        B Young       22,93  32,13  378   21    18,00  28,15  0,64
Wasim Jaffer    V Sehwag      35,82  51,29  619   21    29,48  49,09  0,60
Hannan Sarkar   Javed Omar    20,66  22,08  207   18    11,50  19,34  0,59
A Hilditch      G Wood        31,56  33,61  354   18    19,67  33,47  0,59

It is also interesting that Javed Omar comes both second and second-last. Mark Dekker has easily the worst average of any opening batsman who's opened the innings 15 times.

For all the hugging, Langer and Hayden did slightly worse than would be expected, with a ratio of 0,92. Their average opening partnership of 52,08 is quite good (22nd on the list), but they each have excellent individual opening averages (48,94 and 52,66). The Langer/Hayden and Boycott/Amiss pairs are the only ones to have an average partnership of over 50 and a ratio below 1.

Figures are based on Tests 1 to 1858, that is up to the first Test between New Zealand and Bangladesh.

How would the Gavaskar-Chauhan partnership in Tests have fared according to the calculations? As expected, better, or worse?

Just curious, and a poor one with stats at that. Thanks.
Hey Soulberry. Chauhan and Gavaskar did significantly better than you would expect, with a ratio of 1.14. Chauhan's average was less than 32, so to have an average opening stand with Gavaskar of over 50 was quite good.

If you rank the opening pairs by the ratio, they come 25th out of the 97.

Actually it could be a little bit higher, up around 22nd: for some reason my average partnership for them disagrees with that Cricinfo list (same number of runs, but I have their average as 52.81 whereas Cricinfo has 53.75). Perhaps there's a retired hurt affecting the numbers somewhere.

Either way, they did better than you would have expected.
Thanks David.

Is there a way to apply a test of significance to the better (or worse) figure?

I'm a bit silly with stats and my last big involvement with statistical analysis (doing it all by myself) was when I wrote my thesis some 20+ years ago! I used to apply a test of significance...some P values and t and the like...(am I on the right track?) to see if the observed difference is significant or not.

And thank you for your views at TCWJ. I truly appreciate it.
That's a good question. Unfortunately I don't have any formal stats training (for the first time ever, I'm regretting not doing any stats courses in undergrad!), and this leaves me with some basic gaps in my knowledge of the subject.

I think to do a significance test I'd need to know the distribution of opening partnerships. It's not (I think) a simple case of saying that ratios above 1.2 are significant, because the cut-off value for, say, p = 0.05 will vary with the number of partnerships. (E.g., a pair who bat once together and average 200 will have a very high ratio, but this won't be statistically significant, because it's a one-off.)

I might be able to model this numerically and come up with some p-values, but off the top of my head there are some finicky details that I don't know how to solve....

Given how many times Chauhan and Gavaskar batted together, I'd guess that their ratio is statistically significant, but showing that seems to be a bit too hard for me.
Very interesting. Where do Greenidge-Haynes and Saeed Anwar-Aamir Sohail come in? How about Mark Taylor and Slater? They had some huge partnerships. So did Marsh and Boon, for that matter.

I wonder if Chauhan really could have been slower than Gavaskar :-)
Greenidge-Haynes: 0.98, pretty much bang in the middle, 49th.

Anwar-Sohail: 0.94, 59th.

Taylor-Slater: 1.10, 28th.

Marsh-Boon: 1.08, 33rd.
To be picky: while you have used average as opener, did you exclude the partnership in question itself?

In case I worded that confusingly, what I meant is:

To predict the average partnership of, say, Langer and Hayden, based on their individual averages as openers when _not_ opening together.
No, I didn't exclude innings when they batted together when calculating individual averages. That's not really relevant to what I want to look at here, which is the average size of the partnership compared to the individual averages.

Now that I think about it, it might be relevant to consider the individual averages ONLY when batting with their particular partner. I'm happy with what I've done though, since the overall opening average gives a better reflection on how good the opener was.
hmm... this is totally mind-boggling crazy... math was a nightmare for me :)... I can't believe that Hobbs & Sutcliffe didn't end up on the top... very interesting. Are you a mathematician by profession or undertake anything related to the cricket world statistics?
Hi Scorpi. Hobbs and Sutcliffe have easily the best average opening partnership of any regular opening pair (87,8), but because one averaged over 55 and the other over 60, you'd expect them to have a very high average opening partnership! Franklin and Wright overall weren't as good - average partnership 55 - but Franklin only averaged 23, so their average partnership is really quite remarkable.

I did my undergrad in maths and physics. Cricket statistics is just a hobby for me.

I don't know if you have the archives at hand in the correct format for this, but I'm very interested in trying to derive a statistic called "the-averaged-fall-of-wicket-quotient" for openers. Here's how it works. One of the two opening batsmen is obviously the first wicket to fall in any innings. His partner carries on till he too, is dismissed (in the limiting case, he carries his bat). Now, assign a number to opening batsmen corresponding to the fall of wickets. If you are the first to be dismissed, you get a 1. If you are the second, you get a 2, and so on. Add these up over all innings and divide by the total number of innings. So if a batsman is dismissed in four innings as being the first, second, first and third wickets to fall, his quotient is 1.5. It gives some indication of how deep the opening batsman lands up batting into each innings. I'm not quite sure of how to handle not-outs and instances of carrying the bat. The latter must be assigned some suitably high value, and the former needs to note how many other wickets fell in the same innings (and perhaps the result).

Just an idea.
Sorry, that should be 1.75 above.
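The worked example in the comment can be checked with a few lines of Python. This is only a sketch of the idea as described: the function name `fall_of_wicket_quotient` is made up, and not-out innings are simply left out because the comment leaves their treatment open.

```python
def fall_of_wicket_quotient(dismissal_positions):
    """Mean of the wicket numbers at which an opener was dismissed.

    dismissal_positions holds one integer per completed innings: 1 if the
    opener was the first wicket to fall, 2 if the second, and so on.
    Not-out innings are not handled here (an open question in the comment).
    """
    if not dismissal_positions:
        raise ValueError("need at least one dismissal")
    return sum(dismissal_positions) / len(dismissal_positions)


# First, second, first and third wicket to fall, as in the example.
print(fall_of_wicket_quotient([1, 2, 1, 3]))  # 1.75
```

This agrees with the corrected figure of 1.75.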
That's an interesting idea, Samir. It shouldn't be too hard to calculate, so I'll write a post on it with the results either later tonight or tomorrow.