We have entered the brave new world of AI chatbots. That means rethinking how students learn in school and protecting ourselves from mass-produced misinformation. It also means heeding the growing calls to regulate AI so we can navigate an era in which computers write as fluently as people, or even better.
So far, there is more agreement on the need for AI regulation than on what it should entail. Mira Murati, leader of the team that created ChatGPT, the fastest-growing consumer internet application in history, said governments and regulators should be involved, but she did not suggest how. At a corporate event in March, Elon Musk spoke with similarly broad strokes: "We need some sort of regulatory authority or something that oversees the development of AI." Meanwhile, the sheer range of uses for ChatGPT has upended European efforts to regulate single-purpose AI applications.
To break the deadlock, I propose transparency and detection requirements tailored specifically to chatbots: computer programs that rely on artificial intelligence to converse with users and produce fluent text in response to typed queries. Chatbot applications such as ChatGPT are an enormously important slice of AI, poised to reshape many daily activities, from the way we write to the way we learn. Reining in chatbots is hard enough without getting bogged down in broader AI legislation covering autonomous weapons, facial recognition, self-driving cars, discriminatory algorithms, the economic impacts of widespread automation, and the slim but nonzero chance of a catastrophic disaster that some fear AI could eventually trigger. The tech industry is rushing headlong into the chatbot gold rush; we need fast, targeted legislation that keeps pace.
The new rules should track the two steps AI companies use to create chatbots. First, an algorithm trains on a massive body of text, learning to predict missing words. If it sees enough sentences beginning "It's cloudy today, it could…", it learns that the most likely completion is "rain," just as you would. The trained algorithm can then generate words one at a time, much like your phone's auto-complete feature. Second, human raters painstakingly score the algorithm's output on metrics such as accuracy and relevance to the user's query.
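The first stage can be illustrated with a deliberately tiny sketch (a toy stand-in for real neural-network training, using a made-up three-sentence corpus): count which word most often follows each word, then auto-complete one word at a time.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text real chatbots train on.
corpus = (
    "it is cloudy today it could rain . "
    "it is cloudy today it could rain . "
    "it is sunny today it could clear ."
).split()

# "Training": count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def complete(word: str, length: int = 3) -> str:
    """Greedily extend a prompt, one most-likely next word at a time."""
    out = [word]
    for _ in range(length):
        candidates = following.get(out[-1])
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

print(complete("could", 1))  # prints "could rain": the most frequent continuation
```

Real systems replace these simple counts with a neural network conditioned on thousands of preceding words, but the loop of generating one likely word at a time is the same idea.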
The first regulatory requirement I propose is that all consumer applications built on chatbot technology make public the text on which the AI was originally trained. This text is enormously influential: train a chatbot on Reddit posts and it will learn to speak like a Redditor; train it on The Flintstones and it will talk like Barney Rubble. Someone concerned about toxicity on the web might want to avoid chatbots trained on text from unsavory sites. Public pressure might even dissuade companies from training chatbots on things like conspiracy-theory "news" sites, but only if the public knows what text the companies are training on. In her 1818 novel Frankenstein, Mary Shelley gave us a glimpse into the mind of her monster by listing the books read by this literary ancestor of artificial intelligence. It is time for tech companies to do the same for their own uncanny chatbot creations.
Human raters also shape a chatbot's behavior enormously, which points to a second transparency requirement. One of ChatGPT's engineers recently described the principles the team used to guide this second stage of training: "You want it to be useful, you want it to be truthful, you want it to be, you know, non-toxic. … It should also make it clear that it's an AI system. It must not assume an identity it does not have, it must not pretend to have abilities it does not possess, and when a user asks it to do tasks it is not supposed to do, it must write a refusal message." I suspect the guidelines given to the raters, who included low-wage contract workers in Kenya, were more detailed. But there is currently no legal pressure to disclose anything about the training process.
As Google, Meta, and others rush to embed chatbots in their products to keep up with Microsoft's adoption of ChatGPT, people deserve to know the guiding principles that shape them. Elon Musk is reportedly recruiting a team to build a chatbot to compete with what he sees as ChatGPT's excessive "wokeness"; without more transparency in the training process, we are left wondering what that means and what previously off-limits (and potentially dangerous) ideologies his chatbot will espouse.
The second requirement, therefore, is that the guidelines used in the second stage of chatbot development be carefully articulated and publicly available. This would keep companies from training chatbots sloppily, and it would reveal what political slant a chatbot might have, which topics it will not touch, and what toxicity its developers have failed to screen out.
Just as consumers have the right to know the ingredients of their food, they should know the ingredients of their chatbots. The two transparency requirements proposed here give users the chatbot ingredient lists they deserve. This will help people make healthy choices about their information diet.
Detection motivates the third requirement. Many teachers and organizations are considering banning content produced by chatbots (some, including Wired and one popular coding Q&A site, already have), but a ban is worth little if there is no way to detect chatbot text. OpenAI, the company behind ChatGPT, released an experimental tool to detect ChatGPT's output, but it was terribly unreliable. Fortunately, there is a better way, one OpenAI may soon implement: watermarking. This is a technique that subtly alters a chatbot's word frequencies in a way that is imperceptible to users but provides a hidden stamp identifying the text with its chatbot author.
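One published flavor of this idea, sketched loosely here (the key name, hash choice, and threshold are my illustrative assumptions, not OpenAI's actual design), uses a secret key plus the previous word to pseudorandomly split the vocabulary into a "green" half that the chatbot is nudged to prefer. A detector holding the key simply counts green words: ordinary human text lands on the green list only about half the time, while watermarked output scores far higher.

```python
import hashlib

SECRET_KEY = "demo-key"  # hypothetical; a real deployment would keep this private

def is_green(prev_word: str, word: str) -> bool:
    """Pseudorandomly assign `word` to the 'green' half of the vocabulary,
    seeded by the secret key and the previous word."""
    digest = hashlib.sha256(f"{SECRET_KEY}|{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0  # a deterministic 50/50 split

def green_fraction(text: str) -> float:
    """Detector: the fraction of words that fall on the green list.
    Unwatermarked text scores near 0.5; a chatbot biased toward green
    words scores markedly higher over a long passage."""
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(a, b) for a, b in pairs) / len(pairs)
```

Because many words fit any given context, readers cannot perceive the chatbot's bias toward green words, yet a long passage scoring well above 0.5 is strong statistical evidence of the watermark.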
Rather than simply hoping that OpenAI and other chatbot producers implement watermarking, we should mandate it. And we should require chatbot developers to register their chatbots and unique watermark signatures with a federal agency such as the Federal Trade Commission or the AI oversight agency proposed by Representative Ted Lieu. The agency could then provide a public interface that lets anyone paste in a passage of text and see which chatbots, if any, likely produced it.
The transparency and detection measures proposed here would not slow AI's progress or diminish chatbots' ability to serve society in positive ways. They would simply let consumers make informed decisions and help people identify AI-generated content. While some aspects of AI regulation are genuinely subtle and difficult, these chatbot regulations are clear, urgently needed steps in the right direction.
This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.