Fighting AI

There is an article in Monday’s issue of the Daily Telegraph concerning a lawsuit filed by the New York Times against Microsoft and OpenAI that, on the face of it, is about the copying of copyrighted news articles. But what is at stake is whether an artificial intelligence company could ‘train’ its software on the works of, say, Salman Rushdie, and then produce new Salman Rushdie titles without paying the author any royalty. The article, titled “Silicon Valley’s mimicry machines are trying to erase authors”, is by Andrew Orlowski, a technology journalist who writes a weekly Monday column for the Telegraph. He founded the research network Think of X and previously worked for The Register.

Andrew Orlowski

Orlowski says, “Silicon Valley reacts to criticism like a truculent toddler throwing its toys out of the pram. But acquiring a bit of humility and self-discipline may be just what the child needs most. 

So the US tech industry should regard a lawsuit filed last week as a great learning experience.

The New York Times last week filed a copyright infringement lawsuit against Microsoft and OpenAI.

The evidence presented alleges that ChatGPT created near-identical copies of the Times’ stories on demand, without the user first paying a subscription or seeing any advertising on the Times’ site. 

ChatGPT “recites Times content verbatim, closely summarizes it, and mimics its expressive style”, the suit explains.

In other words, the value of the material that the publisher generates is entirely captured by the technology company, which has invested nothing in creating it.

This was exactly the situation that led to the Statute of Anne in 1710, which first established an author’s legal right to copyright. Then, it was the printing monopoly that was keeping all the dosh.

The concept of an author, a subjective soul who viewed the world in a unique way, really arrived with the Enlightenment.

Now, the nerds of Silicon Valley want to erase it again. Attempts to do just that have already made them richer than anything a Stationers’ Guild member could imagine.

“Microsoft’s deployment of Times-trained LLMs (Large Language Models) throughout its product line helped boost its market capitalization by trillions of dollars in the past year alone,” the lawsuit notes, adding that OpenAI’s value has shot from zero to $90bn. 

With OpenAI’s ChatGPT models now built into so many Microsoft products, this is a mimicry engine built on a global scale.

More ominously, the lawsuit also offers an abundance of evidence that “these tools wrongly attribute false information to The Times”. The bots introduce errors that weren’t there in the first place, it claims. 

They “hallucinate”, to use the Cambridge Dictionary’s word of the year. Publishers who are anxious about the first concern – unauthorised reproduction – should be even more concerned about the second.

Would a publisher be happy to see their outlet’s name next to a ChatGPT News response that confidently asserts, for example, that Iran has just launched cruise missiles at US destroyers? Or at London? 

These are purely hypotheticals, but being the newspaper that accidentally starts World War III is not something that can be good for the brand in the long run.

Some midwit pundits and academics portrayed the lawsuit merely as a tactical licensing gambit. 

This year both the Associated Press and the German giant Axel Springer have cut licensing deals with OpenAI. The New York Times is just sabre-rattling in pursuit of a better deal, so the argument goes.

In response to the lawsuit, OpenAI insisted it respects “the rights of content creators and owners and [is] committed to working with them to ensure they benefit from AI technology and new revenue models”.

However, the industry is worried about much more than money.

Take, for example, the fact that the models that underpin ChatGPT need only hear a couple of seconds of your child’s voice to clone it authentically. The AI does not need to return the next day to perfect its impression. After that, it has a free hand to do what it will with its newfound ability.

So, the economic value of a licensing deal is impossible to estimate beforehand. And once done, it cannot be undone. As one publishing technology executive puts it, “you can’t un-bake the cake”.

Previous innovations in reproduction, from the photocopier to Napster, were rather different beasts, as the entrepreneur and composer Ed Newton-Rex noted this week. Past breakthroughs were purely mechanical or technological changes. But this new generation of AI tools marry technology with knowledge.

“They only work *because* their developers have used that copyrighted content to train on,” Newton-Rex wrote on Twitter, since rebranded as X. (His former employer, Stability AI, is also being sued for infringement.)

Publishers and artists are entitled to think that without their work, AI would be nothing. This is why the large AI operations – and the investors hoping to make a killing from them – should be getting very nervous. They have been negligent in ignoring the issue until now.

“Until recently, AI was a research community that enjoyed benign neglect from copyright holders who felt it was bad form to sue academics,” veteran AI journalist Timothy B Lee wrote recently on Twitter. “This gave a lot of AI researchers the mistaken impression that copyright law didn’t apply to them. It doesn’t seem out of the question that AI companies could lose these cases catastrophically and be forced to pay billions to plaintiffs and rebuild their models from scratch.”

Would wipe-and-rebuild be such a bad thing?

Today’s generative AI is just a very early prototype. Engineers regard a prototype as a learning experience too: it’s there to be discarded.  Many more prototypes may be developed and thrown away until a satisfactory design emerges. A ground-up rebuild can in some cases be the best thing that can happen to a technology product. There’s certainly plenty of room for improvement with this new generation of AI models. 

A Stanford study of ChatGPT’s reliability on medical questions found that fewer than half (41 per cent) of its responses about clinical conditions agreed with the known answer, according to a consensus of physicians. The AI gave lethal advice 7 per cent of the time.

A functioning democracy needs original reporting and writing so that we all benefit from economic incentives for creativity. We must carry on that Enlightenment tradition of original expression. 

Some may find such arguments pompous and any piety from the New York Times difficult to swallow. But there are bigger issues at stake. 

A society that gives up on respect for individual expression, and chooses to worship a mimicry machine instead, probably deserves the fate that inevitably awaits.”

One thought on “Fighting AI”

  1. How does this use of AI circumvent plagiarism rules (“I just was paraphrasing”)?
  2. Shouldn’t AI be termed an oxymoron?
