Systems that consume vast quantities of human creations to emulate our behaviors are pushing forward the cutting edge of AI. DALL-E and Stable Diffusion learned to generate images from prompts, allowing us to view renditions of superheroes as Renoir might have painted them.
Large language models (LLMs) perform a similar trick with language, pumping out realistic press releases or code reviews as performed by a pirate. The most recent iteration, ChatGPT, has drawn immense attention as people have explored the extent of its capabilities in endlessly amusing variations.
So, what does all this mean for progress toward true AI? Is this the real deal? A step back from that? Or a simple parrot, our human artifacts merely reflected back at us through the funhouse mirror of code?
Gangsters and sonnets alike
What this latest class of algorithms clearly demonstrates is a combinatorial understanding of concepts. When asked to explain computer science algorithms as a wiseguy from a 1940s gangster movie, ChatGPT accomplishes both tasks: explaining the algorithm and playing the character. We can vary the character it speaks as or the topic we want it to address.
In another example, Alan Turing, in his 1950 paper "Computing Machinery and Intelligence," introduced the imitation game, now known as the Turing Test, in which a computer can be said to think when a human interrogator can't distinguish it from a human. Turing gave as an example prompt the request to "write a sonnet on the subject of the Forth Bridge." When I posed the task to GPT, its response included this stanza:
“The bridge is grand, a symbol of its time,
A beacon to the Scottish people proud,
It stands today, a testament of pride,
Reminds us of the days when dreams were loud.”
Not every rhyme and meter worked (recall that GPT has never heard a sound, but inferred these concepts from predicting words in sentences), yet it clearly endeavored to construct iambic pentameter and follow the appropriate rhyme scheme. It stayed on topic. The writing was poetic.
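That training recipe, predicting words, is simple to state: given the words so far, the model scores every word in its vocabulary and is penalized when the true next word gets low probability. Here is a toy sketch of that cross-entropy objective; the four-word vocabulary and the scores are invented for illustration, where a real model produces them from billions of parameters:

```python
import numpy as np

# Hypothetical four-word vocabulary; real models use tens of thousands.
vocab = {"the": 0, "bridge": 1, "is": 2, "grand": 3}

def next_word_loss(logits, target_id):
    """Cross-entropy loss for one next-word prediction."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax: scores -> probabilities
    return -np.log(probs[target_id])      # low when the true word is likely

# Pretend the model has read "the bridge is" and must predict "grand".
logits = np.array([0.1, 0.2, 0.3, 2.0])  # invented scores over the vocab
print(next_word_loss(logits, vocab["grand"]))  # small loss: a good guess
```

Minimizing that penalty across trillions of words is the entire curriculum; everything else, from meter to metaphor, is inferred along the way.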
Compelling cognitive abilities
In my limited search, I couldn't find any prior usage of "dreams were loud" as a metaphor (only people complaining about being woken by their dreams). It's an obvious metaphor, relatively shallow as metaphors go, but it's genuine.
We can point to the many poems that fed GPT-3 and question what is truly novel in its output. But if the building blocks are known, the intersections are unique and new. And putting known building blocks together into novel patterns is a compelling cognitive ability.
Although the training data volumes involved are massive, the regularities were all discovered by these networks — the rules of sonnets and limericks, the linguistic quirks of pirate-ese. Programmers did not carefully generate training sets for each task. The models found the rules independently.
Where does GPT-3 fall short? The above stanza is adequate as poetry but doesn't surprise or challenge us. When it imitates a pirate, it doesn't add new nuance to the role. GPT-3 was trained to approximate the most probable words in sentences. We can push it toward more random outputs, sampling not the most likely word but, say, the fifth most likely; even so, it strongly follows the trail of what's been said repeatedly.
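That knob corresponds to what samplers call top-k decoding. Below is a minimal sketch of the idea, assuming only a vector of model scores (logits) over a vocabulary; the toy vocabulary and numbers are invented for illustration and are not GPT's actual decoder:

```python
import numpy as np

def sample_top_k(logits, k=5, temperature=1.0, rng=None):
    """Sample a token from the k most likely candidates.

    k=1 is greedy decoding (always the most probable word);
    larger k and higher temperature let less likely words
    through, the "more random outputs" described above.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    top = np.argsort(logits)[-k:]                # indices of the k best tokens
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                         # softmax over just the top k
    return int(rng.choice(top, p=probs))

# Toy vocabulary and scores standing in for a real model's output.
vocab = ["the", "a", "bridge", "grand", "loud"]
logits = [3.2, 2.9, 1.5, 0.4, 0.1]
print(vocab[sample_top_k(logits, k=5, temperature=1.2)])
```

Run it repeatedly and the same prompt yields different words; drop the temperature toward zero and the output collapses back onto the well-worn path.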
It can explain known tasks well but struggles to offer novel suggestions and solutions. It lacks goals, its own impetus. It lacks a meaningful distinction between what's true and what's merely a likely thing to be said. And it has no long-term memory: generating an article is possible, but a book does not fit in its context window.
More nuanced language understanding
With each new scaling of language models and each research paper hot off the press, we observe a more nuanced understanding of language. Outputs get more varied, abilities more extensive, and the models handle increasingly obscure and technical domains. But the limitations, and the tendency toward banality, persist.
I have become increasingly convinced of how powerful self-attention is as a neural network concept for finding patterns in a complex world. On the flip side, the gaps in the computer's understanding stand out ever more sharply against the rapid improvement in so many other areas.
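For readers who haven't met it, self-attention is the mechanism at the heart of transformer models like GPT: each position in a sequence queries every other position and mixes in what it finds. Here is a minimal single-head sketch, omitting the multi-head projections, masking, and layering of a real transformer; all shapes and weights are illustrative:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence.

    x holds one embedding per token; w_q, w_k, w_v project it
    into query, key, and value spaces. Each output position is
    a weighted mix of every value, with weights set by how well
    its query matches each key.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise match strengths
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax per position
    return weights @ v

# Tiny random example: 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)        # -> (4, 8)
```

Because those mixing weights are computed fresh for every input, the same small mechanism can latch onto whatever pattern the data offers, from rhyme schemes to pronoun references.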
Looking at GPT's handling of pronouns in semantically ambiguous situations, its sense of humor, and its complex sentence structures, I'd surmise that even the current version is sufficient for general language understanding. But some other algorithm, as yet uninvented, or at least some particular combination of existing algorithms and training tasks, is needed to approach actual intelligence.
Understanding language: Identifying meaningful patterns
To return to the initial question: whether it's the unscientific wonder of seeing a Shakespearean sonnet arise from the dust of simple word-prediction tasks, or the steady erosion of the human lead on the myriad benchmarks that plumb the depths of artificial language understanding, the language models in use today are not just a parlor trick. They do not merely parrot human language; they find the meaningful patterns within it, be they syntactic, semantic, or pragmatic.
Yet there's something more going on in our heads, even if it's just the same techniques self-applied at another level of abstraction. Without some clever new technique, we'll continue banging our heads against the limitations of our otherwise impressive tools. And who can say when that bolt of inspiration will strike?
So no, true AI has not yet arrived. But we're significantly closer than we were, and I predict that when it does arrive, some variation of self-attention and contrastive learning will form a significant portion of the solution.
Paul Barba is Chief Scientist at Lexalytics, an InMoment Company.