To deal with the lack of exams during the Covid lockdown, the UK turned to automated decision making via algorithms. How did that work out for you then?

Deciding important matters using algorithms has been part of life for a long time. Yet the current A-level results fiasco reminds us that data science and data-driven decision making are just one tool that we use in coming to acceptable conclusions. Whenever there is automation based on data inputs, you have to ask a series of critical questions:

  • What if the algorithm is wrong?
  • What if it is based on incomplete, partial, and estimated data?
  • What if there is massive variation in the context of collected input data?
  • What if the historical input data is known to contain significant bias?
  • What if you are delivering highly sensitive results into a politically charged environment?

The current chaos also reminds us that we’re in the early days when it comes to the practical application of data science to improve decision making. Just about anyone inside the data science world knows that 80% of the job is “data cleansing”: essentially deciding which data you trust and which you don’t, filling in gaps in the data, and augmenting the limited data set with contextual information. Similarly, people often talk euphemistically about “training the algorithms” and the use of machine learning to build confidence in data. This is often a long and exhausting process, but it is entirely necessary to ensure you don’t get tripped up by any of the many traps contained in large data sets.

You discount these fundamental issues at your peril. And yet here we are.