Algorithms as Co-Researchers: Exploring Meaning and Bias in Qualitative Research

Qualitative researchers now have access to far more advanced algorithmic technologies than they did 70 years ago, and these have revolutionised the way research data is transformed. However, advanced tools alone do not guarantee superior research outcomes. It is crucial to acknowledge the biases that may arise from algorithmic processes and to recognise the reciprocal relationship between the researcher and algorithms. We outline five phases of algorithmic qualitative research and provide examples of how the creative meaning-making process can be made transparent to researchers. Ultimately, we view algorithmic analysis as a dynamic collaboration between the researcher and algorithms, with both parties contributing equally to the research process.

‘Crunching the numbers’ with advanced algorithmic digital technologies is readily understood; we are all familiar with housing numbers in a digital spreadsheet. In the age of big data, analytics and artificial intelligence, the aspiration towards – and often a belief in – the idea that ‘crunching the numbers’ offers helpful rigour and objectivity is becoming popular. But how could digital technologies assist qualitative researchers? In this book chapter from the Cambridge Handbook of Qualitative Digital Research, we argue that this aspiration is more illusion than reality; there is a danger that both qualitative and quantitative researchers may attribute to data an objectivity that it does not possess, with severe consequences for both parties. The chapter explores how qualitative researchers could benefit from close attention to how meaning emerges through a reflexive dance between researcher and algorithms, where the latter can become co-researchers in the process.

Augmenting traditional qualitative methods with advanced algorithmic tools – a phenomenon we call algorithmic qualitative research – raises essential epistemological and methodological questions for researchers. As a revealing example, Steven E. Jones (2016) offers an arresting vignette that encapsulates some of the challenges facing qualitative researchers who use computational analysis: a 1949 meeting between Father Roberto Busa (a Jesuit priest) and Thomas J. Watson (CEO of IBM) at IBM’s headquarters. In the meeting, Busa secured Watson’s support for building a comprehensive index of some 11 million medieval Latin words in the works of St Thomas Aquinas and related authors. This was not to be just any index. All the words were to be ‘lemmatised’ – arranged in alphabetical order and grouped according to their various forms, along with their verbal context and original location. This created almost limitless ways for subsequent researchers to interpret the totality of these works. The successful endeavour that followed is credited by many as the birth of humanities computing.
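As a rough illustration of what a lemmatised index of this kind involves, the following minimal Python sketch groups word forms under a headword and records each occurrence with its verbal context and location. It is not a reconstruction of Busa’s punched-card procedure: the lemma table `TOY_LEMMAS`, the function `build_index`, the `window` parameter, and the sample passages are all hypothetical and invented for illustration, and the sample text is not quoted from Aquinas.

```python
from collections import defaultdict

# Hypothetical toy lemma table: inflected form -> headword.
# A real project would need a full morphological lexicon.
TOY_LEMMAS = {
    "est": "sum",
    "sunt": "sum",
    "esse": "sum",
    "bonum": "bonus",
    "bona": "bonus",
}

# Invented sample passages keyed by a made-up location label.
SAMPLE_PASSAGES = {
    "passage-1": "bonum est esse",
    "passage-2": "bona sunt",
}


def build_index(passages, lemmas, window=1):
    """Return {lemma: {form: [(location, context), ...]}}, sorted alphabetically."""
    index = defaultdict(lambda: defaultdict(list))
    for location, text in passages.items():
        tokens = text.lower().split()
        for i, token in enumerate(tokens):
            # Unknown forms fall back to being their own headword.
            lemma = lemmas.get(token, token)
            # Verbal context: up to `window` words on either side.
            start = max(0, i - window)
            context = " ".join(tokens[start:i + window + 1])
            index[lemma][token].append((location, context))
    # Alphabetical order for headwords and for the forms under each one.
    return {
        lemma: dict(sorted(forms.items()))
        for lemma, forms in sorted(index.items())
    }


index = build_index(SAMPLE_PASSAGES, TOY_LEMMAS)
for lemma, forms in index.items():
    print(lemma)
    for form, occurrences in forms.items():
        for location, context in occurrences:
            print(f"  {form:<6} {location}: {context}")
```

Even in this toy version the researcher makes choices that shape what the index can later reveal: which forms count as the same word, how wide the context window is, and how the text is split into tokens.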

Jones’s story juxtaposes a priest-academic qualitative researcher with IBM, an organisation with an iconic space-related, political, commercial, military, and industrial reach that perhaps best symbolises the scientistic approach dominant during the early Cold War. In doing so, Jones has revealed some inherent tensions in using computer-based data analysis to build qualitative understanding. He has shown that Busa’s creation of his index was far from an uncomplicated hitching of IBM’s computational might to a qualitative investigation in order to augment or expedite it. Rather, it was a delicate and unfolding sociotechnical accomplishment that used technological affordances and cultural practices to shape the data and the meanings that ensued, so that data and meaning became inseparable: ‘The “founding moment” of digital humanities . . . was the creation of a radically transformed, reordered, disassembled, and reassembled version of one of the world’s most influential philosophies’ (Stephen Ramsay).

In the late 1950s, Busa undertook a further project with IBM: a computerised analysis of the fragmented Dead Sea Scrolls. This involved ‘experiment[ing] with a form of literary data processing, an open-ended, step-wise (algorithmic), and iterative process of dissolving and reconstituting the texts as linguistic data in forms that could be rearranged and analysed in any number of ways’ (Jones, 2016: 144), in which ‘“filling in the gaps” as in a “crossword puzzle”, but also potentially filling holes in the story of how the text got fragmented, was an added result of the process’ (2016: 147). Jones described this process as exposing new dimensions in the composition of the strata of the cultural record itself. Busa’s collaboration with IBM is still noteworthy today. It is probably the first example of a qualitative researcher engaging deeply and reflexively with an inherent tension: on one side, the materiality of the data (words and symbols, in his case as in ours) manipulated by the computer; on the other, the possible meanings that emerge, as interpreted by the researcher, when that data is radically transformed, reordered, disassembled, and reassembled.

More than 70 years later, qualitative researchers have access to far more advanced algorithmic technologies in their methodological toolkit, as digital innovations have evolved continuously since Father Busa’s collaboration with IBM began in 1949. However, we argue that advances in the researcher’s technological toolkit do not necessarily mean more advanced qualitative research. We highlight the dangers of glossing over the reflexive dance of meaning-making, which results in problematic opacity and bias. We elaborate on the crucial methodological challenge in this recursive algorithm-researcher partnership: the need for mindful acknowledgement and exposure of the various points in the partnership at which one may have shaped the other. We identify five phases of algorithmic qualitative research and use them to discuss examples of how the creative meaning-making inherent in each phase of the partnership – whether algorithm-derived, interpretation-derived, or selection-derived – can be made transparent to qualitative researchers. In this way, qualitative algorithmic analysis is conceptualised as a reflexive dance between researcher and algorithms, with the latter becoming co-researchers.
