There is a fascinating article by Ben Davis, the national art critic for Artnet.com. In it he demonstrates the dangers of tools such as ChatGPT. Put simply, if ChatGPT were a human, we would say it had fabricated information and presented it as fact. But "fabricated" implies a mental process, an intent to tell lies, and machines cannot form that intent. They can only do as they are told.
When Davis did a search using ChatGPT, the results referred to several genuine authorities in the field. But ChatGPT also "created" a body of work for each and produced summaries and critiques of that work. To be clear: none of the articles or papers attributed to those recognised authorities actually exist.
I'm not going to repeat the article: it's best you read it for yourself. But I am going to review the risks that its findings reveal, and I'm going to use the article to demonstrate why automated research cannot, in its present state, be relied upon.
"We Asked ChatGPT About Art Theory. It Led Us Down a Rabbit Hole So Perplexing We Had to Ask Hal Foster for a Reality Check" is here: https://news.artnet.com/art-world/chatgpt-art-theory-hal-foster-2263711
We know that where data is fact it is reliable. Or is it? In the RegTech world, "John Smith, 25, of Blackacre was convicted of fraud" is a fact. Put it into your KYC analysis. But that record is valid only for as long as it is maintained. What if John Smith appeals and his appeal against conviction succeeds? The problem here is the source of the data. Most convictions are added to KYC databases from published media reports, so-called "adverse media". But the general media only reports successful appeals if that news is likely to gain readers. So "Thief succeeds in appeal" is far less likely to appear than "Child molester's conviction overturned." Simply, outrage sells.
I cannot find a country that has a single, government-run database of decisions in the criminal courts. In some countries, even the official announcement of convictions omits any information that would, or would tend to, identify the guilty. The failure to identify is a different, though important, point; the lack of a comprehensive record of convictions and appeals across all criminal courts, not only the senior courts, is a massive fetter on a system that is capable of destroying lives.
What it means is that the "facts" that data aggregators compile into KYC databases cannot be regarded as facts. They are indicators upon which banks and fintechs should base their own enquiries.
But that's not what happens in the real world. This is: "There's an entry. It'll cost too much to check its accuracy. Reject."
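To make the point concrete, here is a minimal sketch, in Python, of the difference between the two approaches. The AdverseMediaHit structure, its field names and the decision labels are all hypothetical - real screening platforms vary widely - but the principle is the one argued above: an unverified adverse-media entry should prompt enquiry, not automatic rejection.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class AdverseMediaHit:
    """A hypothetical record as a data aggregator might hold it."""
    subject: str
    allegation: str                                # e.g. "convicted of fraud"
    source: str                                    # where the aggregator found it
    reported_on: date
    verified_against_court_record: bool = False    # rarely possible: no single official register
    appeal_outcome: Optional[str] = None           # usually unknown: successful appeals are under-reported

def screen(hit: AdverseMediaHit) -> str:
    """Treat an adverse-media entry as an indicator, not a fact."""
    if not hit.verified_against_court_record:
        # The common real-world shortcut is simply "Reject".
        # The defensible course is to make further enquiries.
        return "REVIEW: unverified media report - seek primary-source confirmation"
    if hit.appeal_outcome == "conviction quashed":
        return "CLEAR: conviction overturned on appeal"
    # Verified conviction, appeal status unknown or unsuccessful.
    return "ESCALATE: confirmed conviction - check for a subsequent, unreported appeal"

# Example: the John Smith report from the text, taken from a press cutting only.
hit = AdverseMediaHit(
    subject="John Smith, 25, of Blackacre",
    allegation="convicted of fraud",
    source="local newspaper",
    reported_on=date(2022, 3, 1),
)
print(screen(hit))  # -> REVIEW: unverified media report - seek primary-source confirmation
```

The expensive branch, of course, is the one labelled REVIEW - which is exactly why, in the real world, it is so often skipped.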
If we can't rely on data to be factual at the time we use it, then how much further away are we from being able to rely on data that is interpreted by algorithms?
I have often written about the problems that arise because of differences in language. And it's in this context that I want to refer to the article by Ben Davis, for the simple reason that it is a remarkable demonstration of the thesis that "just because you are using English words doesn't mean you are writing English."
The thing is that Davis is clearly educated, and educated people like to play with English, to stretch and twist it to a point just before it breaks, then enjoy the twang as it snaps back into shape. That was fine when readers noticed the joke, laughed at it and moved on. Today, however, the jokes, and much bad English, race around the world at the speed of rumour - and are just as likely to stick.
For example, his "colleague chatted me". Kudos for using "colleague", but "chatted me" is a nonsensical expression that makes sense only if the reader is on the same linguistic plane. "Chatted" is the wrong grammatical construction in that context: "chat" does not normally take a direct object.
There are "looped back" and "takeaways", which work in some derivative forms of English but not in others. He also ascribes personality to "the AI".
Davis knows when he's introducing something that's not quite right, like a buzzword: he puts quotation marks around "category collapse". And he sees as "deadpan comedy" ChatGPT's assertion that "it should be noted that AI is not influencing the creation of new works of medieval art."
"…anyone using it should understand it is also unreliable," Naomi Rea writes. "If you take what it says at face value, it could be a dumb and dangerous mistake."
Let's be clear: I really enjoy Davis's writing. It's witty and irreverent but with a message, and he knows how to take English to just this side of the cliff's edge. The ChatGPT piece is a superb article that makes for very enjoyable reading, but only because I'm a human and I can interpret. A machine reading it would need both my internal database and my thought processes to interpret it as I interpret it. And it would need Davis's to interpret it as he meant it. Even then, the two might not be the same.
That's a simple explanation of why financial services companies, fintechs, prosecutors, regulators and courts must not regard interpretative algorithmic analysis as reliable. The adverse consequences of failure can be enormous.
As Davis points out, Large Language Models are often designed to produce "what sounds right."
That brings us back to Turing and, because much of my own work on "artificial intelligence" starts from Turing's basic assumption that if machines are to achieve intelligence they must first learn English, to my oft-repeated question: "but which English?" And, therefore, "what sounds right to whom?"
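A toy sketch, again in Python, illustrates what "sounds right" means in practice. The candidate sentences and the probabilities below are invented for this example; they stand in for the kind of plausible but fabricated citation Davis describes. A language model's decoding step, greatly simplified, chooses the continuation its training text makes most probable, not the one that is true.

```python
# Hypothetical scores for three candidate continuations - the numbers are
# invented purely to illustrate the decoding step, not taken from any model.
candidate_continuations = {
    "Hal Foster explores this in his essay 'Category Collapse'.": 0.41,
    "No essay by Hal Foster with that title appears to exist.":   0.07,
    "I cannot verify that such an essay was ever published.":     0.02,
}

# Simplified decoding: pick whatever the training corpus makes most probable
# - i.e. what "sounds right" - regardless of whether it is accurate.
most_plausible = max(candidate_continuations, key=candidate_continuations.get)
print(most_plausible)
```

Which corpus supplied those probabilities - which English, whose usage, whose errors - is what decides what sounds right, and to whom.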
Davis ends his article with a paragraph that I have only one argument with. He says "this is an application for sounding like an expert, not for being an expert—which is just so, so emblematic of our whole moment, right? Instead of an engine of reliable knowledge, Silicon Valley has unleashed something that gives everyone the power to fake it."
What's my dispute? It's not a Silicon Valley problem - it's a structural problem across society. Paper experts abound, holding sheaves of certificates, often gained through examinations that reward rote learning rather than comprehension. Worse, those who ride the surf of buzzwords convince others that they have a deep understanding of their subject.
But they don't. Yet they provide the greater number of "sources", and so their often shallow and sometimes plain wrong opinions are what search engines find.
ChatGPT might make stuff up, but perhaps even worse is that it can't discriminate between good and bad. Why? Because it hasn't got enough relevant data. Until it's been taught right from wrong (in a factual, not an ethical or moral, sense) it's always going to fail. And until it can identify every twisted use of English, in all its forms, dialects and accents, it's not going to be able to interpret properly, either.
Life-changing decisions are influenced by such interpretations, even when they are not artificially created by a machine.
Great care should be taken. Do Due Diligence on the machine.