The Financial Times announced a deal with OpenAI on Monday to license its world-class journalism for training and informing ChatGPT models. It joins Axel Springer and the Associated Press in similar deals, in which OpenAI reportedly offers millions for the right to use the content. However, ChatGPT was trained on lots of other web-scraped content that OpenAI didn’t pay for. So why is OpenAI paying for some datasets and not others?
OpenAI’s licensing deals seem to send a clear message: We’re going to use your content anyway, so sign a deal with us or get left out. The main benefit of a licensing deal seems to be a prominent place in ChatGPT’s responses. Some publishers also want to strengthen their relationship with the next major information distribution channel before it takes over. However, OpenAI seems to be using a lot of publishers’ content regardless.
OpenAI already partially trains its AI models on “publicly available data,” according to CTO Mira Murati, a phrase that seems deliberately vague. What is publicly available data anyway? The phrase assumes that anything free to read on the Internet is free to reproduce in ChatGPT. For example, Gizmodo is part of OpenAI’s “publicly available data.” Our website was scraped 34,000 times in GPT-2’s WebText dataset, the last training dataset OpenAI made public.
Gizmodo is free to readers because of the advertising on this webpage. If readers can access our content through ChatGPT, that breaks our business model. The New York Times, which appears significantly more often in GPT-2’s WebText dataset, sued OpenAI for copyright infringement over this issue.
A content licensing deal with OpenAI seems to be the only way for publishers to stay relevant in the AI era. In a news release for the newspaper, Financial Times Group CEO John Ridding says the deal will “broaden the reach” of their work while “offering early insights into how content is brought to life through AI.”
“The thing about AI is that it’s not really artificial intelligence,” said Matthew Butterick, a lawyer representing Sarah Silverman and other book authors suing OpenAI, in an interview with Gizmodo. “It’s human intelligence that’s taken from one place, divorced from its creators, then this big tech company puts a price on it and sells it to somebody else.”
Butterick is a litigator in six copyright lawsuits against AI companies. He’s also a writer, coder and designer, so he says he understands how AI could threaten those industries. Typically, his cases center on claims that AI simultaneously consumes the work of creators and threatens their livelihoods.
OpenAI’s licensing deals raise eyebrows about the content ChatGPT uses for free. Tech companies have argued that generative AI is a “fair use” of copyrighted works because it transforms them into something new. The AI industry has also argued that it is using a model similar to Google Search, which indexes copyrighted material to create a useful, information-seeking tool. Like Google, AI chatbots have recently started incorporating hyperlinks. Ultimately, a court will have to decide whether generative AI is “fair use.”
OpenAI did not immediately respond to Gizmodo’s request for comment.
Book authors and publishers aren’t the only ones OpenAI is taking content from. The New York Times recently reported that OpenAI trained GPT-4 on over a million hours of transcribed YouTube videos. Days before the report came out, YouTube’s CEO said using its videos for AI training would be a “clear violation” of its policies.
OpenAI’s content licensing muddies the waters of the debate. The company is somehow using some Internet content for free while paying others for their work. Other tech companies, such as Apple, have reportedly been more proactive about paying for all their training data. Adobe has reportedly paid $3 per minute of video to train its AI video generator.
However, it is not clear whether even a one-time payment is sufficient compensation for AI training data. We’re talking about a tool that could potentially upend the media industry for writers, audio and video producers, and more. Signing a contract with OpenAI might guarantee you a good spot in ChatGPT results, but it looks like the AI chatbot is using your content anyway. At least for now, AI companies seem eager to use everything on the Internet and ask questions about its legitimacy later.
A version of this article was originally published on Gizmodo.
Credit: qz.com