There is no artificial intelligence without the vast trove of human knowledge. Today’s generative AI applications were built on a foundation of such information, drawn from across the internet and from various databases totaling, according to at least one estimate, somewhere around 300 billion words.

That’s a lot of intellectual property, much of it produced by generations of professional writers, honed and polished by editors and sent out into the world by publishers in newspapers, magazines, books and more.

It’s hard to put an exact price on such a thing, or even to measure the collective value of such an incredible library.

It definitely should not be free.

But that’s the assumption made by OpenAI when it claims that its use of all this data, much of which it acknowledges was subject to various copyrights, is fair use and did not require compensation to the original creators and owners of that knowledge and information.

If you walked into a bookstore and stole not just some of the books, but all of the books, that would be a crime, right?

That’s why newspapers, including this one, as well as authors and an array of digital publishers have filed lawsuits seeking to force OpenAI to pay for its exploitation of their work.

Regular people aren’t allowed to make copies of a recent best-seller and resell them with a different cover, nor can a studio stream a competitor’s series just because it’s on the internet and can be copied. They might be able to license that material, if the owner allows it, and they can certainly buy copies, but even buying a copy doesn’t give the purchaser the right to reproduce and redistribute such works.

There’s a fundamental issue of ownership in play here.

For decades, newspapers have been independent entities. They have written the obituaries of local luminaries, chronicled crimes committed, and followed fights over public works. In nearly every U.S. city, they’ve accumulated a great storehouse of knowledge, day by day.


