Who bats best?

Cricket commentators often talk of changes in batting quality through the ages, or claim that batting order affects averages (or vice versa). But is there anything in these claims?

animation
sport
Published

October 9, 2016

A couple of years ago, I downloaded the top 200 averages for each position in the batting order from the wonderful stats engine at espnCricinfo, then tidied them up with the code below.

For reference, here is a glimpse of the resulting data, showing the first few values in each column.

Code
library(tidyverse)
library(gganimate)
library(glue)
  
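# Read the raw download, then split Player into a name and a country code
batOrder <- read_csv("BattingOrder.csv") |>
  mutate(
    Name = word(Player, start = 1L, end = -2L),  # everything except the last word
    fullCountry = word(Player, -1),              # last word, which holds the country
    Country = str_sub(fullCountry, 2, -2)        # drop its first and last characters
    ) |>
  # Keep only players whose country code is one of these nine teams
  filter(
    Country %in% c("Aus", "Ban", "Eng", "India", "NZ", "Pak", "SA", "SL", "WI")
    ) |>
  mutate(
    Start = as.integer(str_sub(Span, 1, 4)),     # first four characters of Span, as a year
    Decade = 10*trunc(Start/10),                 # decade containing that year
    Name = str_replace_all(Name, "'", " ")       # replace apostrophes in names with spaces
    ) |>
  select(Name, Country, Start, Decade, Ave, Innings = Inns, Runs, Bat)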

glimpse(batOrder, width = 70)
Rows: 873
Columns: 8
$ Name    <chr> "B Mitchell", "JB Hobbs", "RB Simpson", "L Hutton", …
$ Country <chr> "SA", "Eng", "Aus", "Eng", "India", "Eng", "Eng", "S…
$ Start   <int> 1931, 1908, 1961, 1938, 2002, 1924, 1986, 1996, 1948…
$ Decade  <dbl> 1930, 1900, 1960, 1930, 2000, 1920, 1980, 1990, 1940…
$ Ave     <dbl> 65.04, 58.14, 58.03, 57.74, 55.19, 54.38, 53.84, 52.…
$ Innings <dbl> 26, 87, 28, 120, 21, 20, 21, 41, 23, 90, 105, 74, 35…
$ Runs    <dbl> 1431, 4768, 1567, 6236, 1159, 979, 1023, 2002, 1106,…
$ Bat     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
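
To make the string handling in that pipeline concrete, here is a minimal sketch on a single record. It assumes, since the raw file is not shown, that `Player` looks like "Name (Country)" and that `Span` looks like "first year-last year"; the two input strings are illustrative stand-ins rather than rows copied from the CSV.

```r
library(stringr)

# Assumed raw formats (not shown in the post): "Name (Country)" and "YYYY-YYYY"
player <- "JB Hobbs (Eng)"
span   <- "1908-1930"

# Everything except the last word is the name
name <- word(player, start = 1L, end = -2L)

# The last word is "(Eng)"; dropping its first and last characters leaves the code
country <- str_sub(word(player, -1), 2, -2)

# The first four characters of the span give the starting year
start <- as.integer(str_sub(span, 1, 4))

# Round the starting year down to its decade
decade <- 10 * trunc(start / 10)

print(c(name, country, start, decade))
```

With these inputs the sketch gives a starting year of 1908 and a decade of 1900, which matches the values shown for JB Hobbs in the output above.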


We can then analyse the data by batting order, using the splendid gganimate.

Code
batOrder |> 
  ggplot(
    aes(
      x = Decade,
      y = Ave,
      color = Country, 
      size = Innings
      )
    ) +
  geom_point(alpha = 1) +
  labs(
    x = "Decade in which the batsman's career began",
    y = ""
    ) +
  ggtitle(
    'The best players who have ever batted at {closest_state} in the order',
    subtitle = 'Average when batting at that position'
    ) + 
  transition_states(
    states = Bat,
    transition_length = 2,
    state_length = 1
    ) + 
  ease_aes('cubic-in-out')
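
The plot above only defines the animation. As a rough sketch of how it could be rendered and saved, the example below builds a much smaller animation of the same shape and writes it to a GIF. The `toy` data frame, its values, and the file name are made up purely for illustration, and rendering assumes a GIF renderer such as the gifski package is installed.

```r
library(ggplot2)
library(gganimate)

# Made-up stand-in data (not real averages): three batting positions, two decades each
toy <- data.frame(
  Decade = c(1950, 2000, 1950, 2000, 1950, 2000),
  Ave    = c(45, 50, 40, 48, 35, 42),
  Bat    = c(1, 1, 2, 2, 3, 3)
)

# One state per batting position; {closest_state} fills in the current position
anim <- ggplot(toy, aes(x = Decade, y = Ave)) +
  geom_point() +
  ggtitle("Batting at {closest_state} in the order") +
  transition_states(
    states = Bat,
    transition_length = 2,  # relative time spent moving between positions
    state_length = 1        # relative time spent pausing on each position
  ) +
  ease_aes("cubic-in-out")

# Render the frames, then write them to disk (hypothetical file name)
rendered <- animate(anim, nframes = 60, fps = 10)
anim_save("batting-order.gif", animation = rendered)
```

Raising `nframes` or `fps` gives a smoother animation at the cost of a longer render and a larger file.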