OpenAI’s ChatGPT, the artificial intelligence program that has been in all the headlines for generating human-seeming text, caused a new round of controversy recently when the version of it running in Microsoft’s Bing search beta started to produce bizarre outputs that some users found disturbing.
Unfortunately, some of the reporting about the chatbot is itself confusing. In the rush to relate every new detail in a way that will grab attention, reporters are applying dramatic language that doesn’t inform and in fact obscures what is going on with AI, a disservice to the public.
A prime example came with The New York Times’s publication of a first-hand report by writer Kevin Roose of a two-hour session with the Bing beta. During the session, Roose relates, the program revealed a personality under the sobriquet “Sydney,” professed love for Roose, and proceeded to make aggressive insinuations about Roose’s marriage.
Roose relates that he was “deeply unsettled, even frightened” as a result of the exchange.
That hyperbole is misleading. If, as Roose claims, he understands how AI works, then there’s no reason for such dramatic language. The swerve into strange verbiage may be inappropriate, but it reflects a well-known aspect of chatbots known as a “persona.”
An AI chatbot such as ChatGPT is programmed to produce the next symbol in a string of symbols that is the most likely complement, or continuation, of the symbols it is fed by a human at the command prompt. The way that the program produces that output can be molded to conform to a certain genre or style, which is the persona.
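The next-symbol idea can be illustrated with a deliberately tiny sketch. This is not OpenAI’s actual model — real chatbots use neural networks trained on vast corpora — but a toy bigram counter that does, in miniature, the same job: pick the continuation that most often followed the preceding symbol in its training text. The sample corpus and function names here are invented for illustration.

```python
from collections import Counter, defaultdict

# A toy training corpus; a real model ingests billions of words.
corpus = (
    "the assistant answers the question "
    "the assistant writes the code "
    "the user asks the question"
).split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(prev: str) -> str:
    """Return the continuation most often seen after `prev`."""
    return follows[prev].most_common(1)[0][0]

def continue_text(prompt: str, n: int = 3) -> str:
    """Extend the prompt one most-likely word at a time."""
    words = prompt.split()
    for _ in range(n):
        words.append(next_word(words[-1]))
    return " ".join(words)
```

Nothing in this loop “wants” anything; it only reports which word most plausibly comes next, and the same is true, at enormously greater sophistication, of a chatbot.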
For example, in a research paper posted on arXiv in January, IBM scientists used another version of an OpenAI program, called Codex, which was developed by ingesting 54 million examples of software code from GitHub. The Codex program is used for Microsoft’s GitHub Copilot program, to assist with programming.
Lead author Steven Ross of IBM Research and colleagues wondered if they could get the Codex program to produce interactions that went beyond simply providing computer code. They called their attempt, “A Case Study in Engineering a Conversational Programming Assistant’s Persona,” and dubbed their adaptation of Codex the “Programmer’s Assistant.”
The prompt, where the scientists type their string of words, is the way they “program” the persona for their version of the Codex program.
As the authors write: “The initial prompt we use for the Programmer’s Assistant consists of a prologue that introduces the scene for the conversation, establishes the persona of the assistant, sets a tone and style for interaction.”
When they began their prompt with “This is a conversation with Socrates, an expert automatic AI software engineering assistant,” the program responded with conversation, like ChatGPT, but the authors felt it was too “didactic,” a kind of know-it-all.
So, they revised their prompt: “This is a conversation with Socrates, an eager and helpful expert automatic AI software engineering assistant …” and found they got more of the tone they wanted.
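Mechanically, a persona prologue like this is just text prepended to the running transcript before the whole thing is handed to a completion model. The sketch below shows that assembly step in a hedged form: the prologue wording echoes the IBM paper, but the function and variable names are our own, and no actual model call is made.

```python
# The fixed prologue that "programs" the persona. Everything else the
# model sees is simply the conversation so far, appended beneath it.
PROLOGUE = (
    "This is a conversation with Socrates, an eager and helpful expert "
    "automatic AI software engineering assistant."
)

def build_prompt(history: list, user_message: str) -> str:
    """Assemble the full text a completion model would be asked to continue.

    `history` is a list of (speaker, text) pairs from earlier turns.
    """
    lines = [PROLOGUE]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"User: {user_message}")
    lines.append("Socrates:")  # the model continues from this point
    return "\n".join(lines)
```

Changing a single adjective in `PROLOGUE` — “didactic” versus “eager and helpful” — changes the most probable continuations, which is exactly the effect the IBM team reported.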
In other words, a persona is something created by the very words the human interlocutor types into a program such as Codex, just as with ChatGPT. Those programs produce output that may match human input in a variety of ways, some of it appropriate, some of it less so.
In fact, there is a whole emerging field of prompt writing, aimed at shaping how language programs such as ChatGPT perform, and there’s even a form of computer cracking that aims to make such programs violate their instructions by using prompts to push them in the wrong direction.
There is a growing literature, too, about how chatbots and other AI language programs can succumb to what’s called “hallucination,” where the output of the program is demonstrably false, or potentially inappropriate, as may be the case in Roose’s account.
A report in November by researchers at the artificial intelligence lab of Hong Kong University surveyed the numerous ways such programs can hallucinate. A common source is training data in which reams of Wikipedia summary boxes are paired with the opening sentences of the corresponding articles.
If there’s a mismatch between the summary and the first sentence — and 62% of first sentences in articles have additional information that’s not in the summary box — “such mismatch between source and target in datasets can lead to hallucination,” the authors write.
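That source–target mismatch can be made concrete with a toy check. In the sketch below, the “source” is a summary-box-style record and the “target” is a first sentence containing facts — a death year, a nationality — that the source never states; a model trained on many such pairs learns to emit content it was never given. The Ada Lovelace entry and the helper function are illustrative inventions, not examples from the survey paper.

```python
import re

# Source: a Wikipedia-style summary box. Target: the article's first
# sentence. Note the target asserts facts absent from the source.
source = {"name": "Ada Lovelace", "born": "1815", "field": "mathematics"}
target = "Ada Lovelace (1815-1852) was an English mathematician."

def unsupported_tokens(source: dict, target: str) -> set:
    """Tokens in the target sentence that appear nowhere in the source."""
    attested = set()
    for value in source.values():
        attested.update(re.findall(r"[a-z0-9]+", value.lower()))
    return set(re.findall(r"[a-z0-9]+", target.lower())) - attested
```

Here the death year “1852” and the nationality “English” are in the target but unattested in the source — precisely the gap that, multiplied across millions of training pairs, teaches a model to fabricate.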
The point of all this is that in chatbots, there is a technical reason why such programs veer into surprising verbiage. There is no intention of stalking or otherwise menacing a user behind such verbiage; the program is merely choosing the next word in a string of words that may be a logical continuation. Whether it is, in fact, logical, may be affected by the persona into which the program has been nudged.
At best, reporting that uses extreme verbiage — “deeply unsettled,” “frightened” — fails to explain what’s going on, leaving the public in the dark as to what has actually transpired. At worst, such language implies the kinds of false beliefs about computer “sentience” that were propounded in 2022 by former Google employee Blake Lemoine when he claimed Google’s LaMDA program, a program similar to OpenAI’s, was “sentient.”
Interestingly, neither Lemoine nor the Times’s Roose gives much attention to the fact that they spent an extraordinary amount of time in front of a screen. As the IBM research shows, extended interactions play a role in shaping the persona of the program — not by any sentient intention, but by the act of typing, which alters the probability distribution of words.
Microsoft, in response to the criticism, has imposed limits on the number of times a person can exchange words with Bing.
It may be just as well, for the mania around ChatGPT is partly a product of humans not examining their own behavior. While AI may hallucinate, in the sense of producing erroneous output, it’s even more the case that humans who spend two hours typing in front of a computer monitor will hallucinate in their own way: they will start to ascribe importance to things far in excess of their actual significance, and embellish their subject with all sorts of inappropriate associations.
As prominent machine learning critic and NYU psychology professor emeritus Gary Marcus points out, Roose’s hyperbole about being frightened is simply the flip side of the writer’s irresponsible praise for the program the week prior:
The media failed us here. I am particularly perturbed by Kevin Roose’s initial report, in which he said he was “awed” by Bing. Clearly, he had not poked hard enough; shouting out prematurely in The New York Times that there is a revolution without digging deep (or bothering to check in with skeptics like me, or the terrific but unrelated Mitchells, Margaret and Melanie) is not a good thing.
Marcus’s entire article is an excellent example of how, rather than trying to sensationalize, a thorough inquiry can tease apart what’s going on, and, hopefully, shed some light on a confusing topic.