Meta unveiled its most advanced chatbot to date on Friday, called BlenderBot 3, and is letting all adults in the US have conversations with the bot to help it improve.
BlenderBot 3 can engage in general chitchat, says Meta, but can also answer the sort of queries you might ask a digital assistant, “from talking about healthy food recipes to finding child-friendly amenities in the city.”
By releasing the chatbot to the general public, Meta wants to collect feedback on the various problems facing large language models. Users who chat with BlenderBot will be able to flag any suspect responses from the system, and Meta says it’s worked hard to “minimize the bots’ use of vulgar language, slurs, and culturally insensitive comments.”
“Researchers can’t possibly predict or simulate every conversational scenario in research settings alone,” Meta AI researchers wrote in a Friday blog post.
“The AI field is still far from truly intelligent AI systems that can understand, engage, and chat with us like other humans can,” says Meta. “In order to build models that are more adaptable to real-world environments, chatbots need to learn from a diverse, wide-ranging perspective with people ‘in the wild.'”
Meta has been working to address these problems since it first introduced the BlenderBot 1 chat app in 2020. Initially little more than an open-source NLP experiment, by the following year, BlenderBot 2 had learned both to remember information it had discussed in previous conversations and to search the internet for additional details on a given subject.
Meta’s chatbot evaluates data from people it speaks with
BlenderBot 3 takes those capabilities a step further by not just evaluating the data it pulls from the web but also the people it speaks with.
When a user flags an unsatisfactory response from the system—currently hovering around 0.16 percent of all training responses—Meta works that feedback back into the model so it avoids repeating the mistake.
The system also employs the Director algorithm, which first generates a response using training data and then runs that response through a classifier to check whether it fits within a scale of right and wrong defined by user feedback.
“To generate a sentence, the language modeling and classifier mechanisms must agree,” the team wrote. “Using data that indicates good and bad responses, we can train the classifier to penalize low-quality, toxic, contradictory, or repetitive statements, and statements that are generally unhelpful.”
The system also employs a separate user-weighting algorithm to detect unreliable or ill-intentioned responses from the human conversationalist—essentially teaching the system not to trust what that person has to say.
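One simple way to picture such user weighting is to track each partner's history of adversarial feedback and scale down their influence on training accordingly. The class, heuristic, and weighting formula below are hypothetical illustrations, not Meta's actual algorithm.

```python
# Illustrative sketch (names and heuristics are hypothetical, not Meta's
# actual method): down-weight feedback from conversational partners who
# repeatedly give unreliable or adversarial signals, so their corrections
# contribute less to future training.

from collections import defaultdict

class UserTrust:
    def __init__(self):
        self.history = defaultdict(lambda: {"total": 0, "bad": 0})

    def record(self, user_id, feedback_was_adversarial):
        # Log one piece of feedback and whether it was judged adversarial.
        stats = self.history[user_id]
        stats["total"] += 1
        stats["bad"] += int(feedback_was_adversarial)

    def weight(self, user_id):
        # Weight in [0, 1]: 1.0 for a clean (or empty) history,
        # shrinking toward 0 as the share of adversarial feedback grows.
        stats = self.history[user_id]
        if stats["total"] == 0:
            return 1.0
        return 1.0 - stats["bad"] / stats["total"]

trust = UserTrust()
trust.record("user_a", feedback_was_adversarial=False)
trust.record("user_b", feedback_was_adversarial=True)
print(trust.weight("user_a"))  # → 1.0
print(trust.weight("user_b"))  # → 0.0
```

A training pipeline could then multiply each user's feedback signal by this weight before folding it back into the model, so a bad-faith conversationalist cannot steer the bot.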
“Our live, interactive, public demo enables BlenderBot 3 to learn from organic interactions with all kinds of people,” the team wrote. “We encourage adults in the United States to try the demo, conduct natural conversations about topics of interest, and share their responses to help advance research.”
BB3 is expected to speak more naturally and conversationally than its predecessor, thanks in part to its massively upgraded OPT-175B language model, which is nearly 60 times larger than BB2’s.
“We found that, compared with BlenderBot 2, BlenderBot 3 provides a 31 percent improvement in overall rating on conversational tasks, as evaluated by human judgments,” the team said.
“It is also judged to be twice as knowledgeable while being factually incorrect 47 percent less of the time,” the team said. “Compared with GPT3, on topical questions, it is found to be more up-to-date 82 percent of the time and more specific 76 percent of the time.”