Lies, damned lies, statistics…and the indiscriminate ramblings of Code Interpreter
ChatGPT’s latest add-on is the antithesis of data science
The thread-bros have been out in force this week, heaping boundless praise on Code Interpreter, the latest extension to OpenAI’s ChatGPT. Its use cases extend beyond data analysis, but that seems to be where much of the hype resides. Riffing on Steve Jobs’ iPod tagline, the evangelists claim that Code Interpreter is like having a thousand data analysts in your pocket. For the $20/month cost of a ChatGPT Plus subscription, we can all enjoy access to an ‘on-demand team of McKinsey consultants’, tweets someone who presumably has never set foot in a McKinsey office.
Does Code Interpreter spell the end for professional data analysts? Perhaps it heralds the dawn of a new age where all of us, regardless of our background in data science and statistics, get to play the role of data nerd.
I was hopeful ahead of the release, having experienced firsthand how ChatGPT assists with generating code, removing a major bottleneck for data analysts who, like me, find programming a chore. Code Interpreter seems the logical next step: ostensibly, it removes the requirement of programming altogether. Simply upload a spreadsheet (or several) and, with very little prompting, the tool gets to work reading and analysing your data.
My first encounters with Code Interpreter suggest that, even more so than with ChatGPT itself, there are caveats and trade-offs that its fiercest advocates have yet to reckon with.
Trigger-happy data analysis
The ease of use of Code Interpreter is in keeping with ChatGPT’s seamless user interface. Your instructions can be brief, even vague. Code Interpreter is only too eager to rush towards generating charts and metrics that purport to give an accurate picture of your data.
The readiness with which Code Interpreter spits out its findings may seem like a strength, but it promotes an ethic that runs contrary to the rigours of data analysis.
Exploratory data analysis is fraught with danger. When we seek insights by studying patterns in numbers, caution is prudent because patterns mislead as often as they prove meaningful. Charts and metrics can enlighten or lead us astray. The distinctions are unforgivingly fine. It is only by engaging with data mindfully, and by interpreting trends in context, that one can hope to tell the difference. We must also pose our questions selectively and with great care, for only a paltry subset of them result in meaningful answers from a given dataset.
For those hoping Code Interpreter will place them on a fast track to data expertise, the news is sobering. There are no shortcuts. Just as possession of a calculator does not suddenly sharpen your intuition for numbers, Code Interpreter will not have an outsized impact on your data skills. The idea that the tool will suddenly liberate laypeople to dive deep into their household budgets (among the many imagined use cases) is AI solutionism at its worst. It glosses over the root causes of data (and financial) illiteracy, which have never been about a paucity of technology.
The truth is more bitter still: unregulated use of Code Interpreter, without the requisite foundations of data science, can do untold damage as we substitute trigger-happy chart generation for rigour. Data analysis is often described as an act of storytelling, but most stories contain fictions. The ease with which we are deceived by flawed empirical claims should put us all on high alert. Code Interpreter is the disinformer’s dream, lending credibility to their falsehoods under the guise of superficial analysis and pretty charts.
It takes a data analyst to bring truth to Code Interpreter
Generative AI has, in many cases, proven itself an amplifier of human talent. I am a more capable programmer due to ChatGPT, no question. The same is not yet true for Code Interpreter and my skills as a data analyst. The reason is by now a familiar one for generative AI: Code Interpreter has the most tenuous grasp of the truth.
When I earnestly sought Code Interpreter’s help with analysing a particular file, we barely got past the initial processing stage because I had reason to doubt that it had even read the data correctly. The more I queried Code Interpreter’s rationale, the more I felt I was being led down a rabbit hole of confabulation (it claimed, falsely, that the discrepancy was due to a difference in indexing conventions). Later on, as Code Interpreter devised metrics that it claimed would be central to my understanding of the data, I was troubled by some of its methods (treating null values as zero when calculating averages, for instance). The more Code Interpreter is permitted to venture into an analysis, the more it seems to resort to senseless regurgitation.
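The null-handling point is easy to underestimate, so here is a minimal sketch in Python of what is at stake. The numbers are made up for illustration, and the `readings` list and both function names are hypothetical; this is not the code that Code Interpreter produced, only a small instance of the same choice.

```python
from statistics import mean

# Hypothetical readings; None marks a missing (null) value.
readings = [12.0, None, 15.0, None, 9.0]


def mean_nulls_as_zero(values):
    """Average with every missing value silently replaced by 0."""
    return mean(0.0 if v is None else v for v in values)


def mean_skipping_nulls(values):
    """Average over the observed values only."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    return mean(observed)


print(mean_nulls_as_zero(readings))   # 7.2
print(mean_skipping_nulls(readings))  # 12.0
```

Neither figure is automatically the right one. Skipping nulls assumes the missing readings resemble the observed ones, which may not hold either. The point is that the choice changes the answer substantially, and it is a choice the analyst should make knowingly rather than have made on their behalf.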
Code Interpreter, like large language models in general, does not possess its own truth-checking mechanism. Instead, it accompanies its outputs with lines of code that are open to inspection - a tick for transparency perhaps, but only if one has the expertise to take advantage. The skill and temperament needed to dive into Code Interpreter’s workings, to debug it, and to course-correct it onto a more sensible path all amount to the work of data analysts - the very people this tool is apparently poised to displace.
Chatbots work wonders when they are aimed squarely at the programming stage of the data workflow. The division of cognitive labour is clear: the human analyst drives the inquiry, and the chatbot is merely the accelerator. The difference with Code Interpreter is that it assumes control of the inquiry, guiding users down avenues it has no grounding in. By granting Code Interpreter dominion over the book-ends of the data analysis workflow - deciding what questions to pose and what models to use at one end, generating and interpreting trends out of context at the other - we find ourselves at the mercy of its half-truths. Wresting back control requires a firm and disciplined understanding of how to navigate data (as well as the time to rein in fruitless data explorations). It takes nothing less than a professional data analyst, the net gain for whom is far from a given.
Code Interpreter is not the ‘copilot’ that many presume it to be. Maybe it could be. If, rather than arbitrarily spitting out charts and half-baked interpretations, it instead encouraged users to pose meaningful questions, to evaluate the assumptions behind their models, to navigate their analysis studiously, and to ground their conclusions in meaningful contexts, then - and only then - it might just make better data analysts of us all.