There is no artificial intelligence without the vast trove of human knowledge.
Today’s generative AI applications were built on a foundation of such information, drawn from across the internet and from various databases totaling, according to at least one estimate, somewhere around 300 billion words.
That’s a lot of intellectual property, much of it produced by generations of professional writers, honed and polished by editors and sent out into the world by publishers in newspapers, magazines, books and more.
Hard to put an exact price on such a thing or even to measure the collective value of such an incredible library.
It definitely should not be free.
But that’s the assumption made by OpenAI when it claims that its use of all this data, much of which it acknowledges was subject to various copyrights, is fair use and did not require compensation to the original creators and owners of that knowledge and information.
If you walked into a bookstore and stole not just some of the books, but all of the books, that would be a crime, right?
That’s why newspapers, including this one, as well as authors and an array of digital publishers have filed lawsuits seeking to force OpenAI to pay for its exploitation of their work.
Regular people aren’t allowed to make copies of a recent best-seller and resell them with a different cover, nor can a studio stream a competitor’s series just because it’s on the internet and it’s possible to copy it. They might be able to license that material, if the owner allows it, and they can certainly buy copies, but even buying a copy doesn’t give the purchaser the right to reproduce and redistribute such works.
There’s a fundamental issue of ownership in play here.
For decades, newspapers have been independent entities. They have written the obituaries of local luminaries, chronicled crimes committed, and followed fights over public works. In most every U.S. city, they’ve accumulated a great storehouse of knowledge, day by day.
The theft of that journalism to create new products clearly intended to supplant news publishers further undermines the economy for news at a time when fair and balanced reporting and a shared set of facts are more critical than ever before.
Weakening news publishers also has a collateral effect on democracy: The practice not only siphons off publisher revenue but also damages publishers’ reputations by attributing bogus information to credible publications.
Such misattributions are a form of AI “hallucination,” which occurs when an AI app provides false information in response to a user’s question.
The rise of artificial intelligence may be inevitable, but that does not mean the originators of the content should go without adequate compensation.
OpenAI and its primary backer, Microsoft, pay their engineers to write their code and certainly recognize the value of that code. In fact, a recent valuation for OpenAI was $90 billion.
Surely all the knowledge and information required to train their apps – to develop the code, as it were – has value.
That value must be recognized and these companies must be held accountable.