An artist's interpretation comparing a human and machine model for learning a new class of alphabet symbols. | Danqing Wang
Researchers have created a computer model that captures humans' unique ability to learn new concepts from just a few examples, a new study in the 11 December issue of Science reports. The model is capable of learning handwritten characters from an array of global alphabets, classifying them into their alphabet category, and drawing new examples that look "right" as judged by humans.
"For the first time, we think we have a machine that learns the way humans do," said study co-author Josh Tenenbaum, professor of computational cognitive science at the Massachusetts Institute of Technology, "but this is just a first step. The gap between machine and human learning remains vast and we want to close it."
Recent years have seen steady advances in machine learning, yet the human mind is still best when it comes to solving a diverse array of problems, including those involving learning new concepts and recognizing new objects. People often need just an example or two to learn something in such cases, where machines require tens or hundreds.
What's more, after learning a new concept for the first time, people can typically leverage their problem-solving skills to use the concept in richer, more creative ways.
Tenenbaum and colleagues sought to develop a model that captured these human-learning abilities. They focused on a large class of simple visual concepts — handwritten characters from alphabets around the world — building their model to "learn" this large class of visual symbols, and make generalizations about it, from very few examples.
The researchers call their modeling scheme the Bayesian program learning framework, or BPL. It works by representing individual concepts as simple computer programs. "Learning" under this scheme is a search for the best program under a scoring system based on Bayesian probability.
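To make the idea of "searching for the best program under a Bayesian score" concrete, here is a minimal toy sketch in Python. It is not the researchers' implementation: the stroke names, the 3x3 grid, the length-based prior, and the pixel-noise likelihood are all assumptions chosen purely for illustration. Each candidate "program" is a tuple of strokes, and its score is a log prior (favoring shorter programs) plus a log likelihood (how well the rendered strokes match the observed image).

```python
import itertools
import math

# Hypothetical stroke primitives on a 3x3 grid: name -> cells the stroke covers.
STROKES = {
    "vertical": {(0, 1), (1, 1), (2, 1)},
    "horizontal": {(1, 0), (1, 1), (1, 2)},
    "top_bar": {(0, 0), (0, 1), (0, 2)},
}

# An observed image of a "T" shape, as the set of inked cells.
T_IMAGE = {(0, 0), (0, 1), (0, 2), (1, 1), (2, 1)}


def render(program):
    """Run a program (a tuple of stroke names) and return the inked cells."""
    cells = set()
    for stroke in program:
        cells |= STROKES[stroke]
    return cells


def log_prior(program, per_stroke_cost=1.0):
    """Assumed prior: each extra stroke makes a program less probable."""
    return -per_stroke_cost * len(program)


def log_likelihood(program, image, noise=0.05, grid=3):
    """Assumed likelihood: each cell independently matches with prob 1 - noise."""
    rendered = render(program)
    total = 0.0
    for row in range(grid):
        for col in range(grid):
            agree = ((row, col) in rendered) == ((row, col) in image)
            total += math.log(1 - noise if agree else noise)
    return total


def score(program, image):
    """Unnormalized log posterior: log prior plus log likelihood."""
    return log_prior(program) + log_likelihood(program, image)


def all_programs():
    """Enumerate every non-empty combination of the stroke primitives."""
    names = sorted(STROKES)
    programs = []
    for size in range(1, len(names) + 1):
        programs.extend(itertools.combinations(names, size))
    return programs


def best_program(candidates, image):
    """'Learning' in this toy: pick the highest-scoring candidate program."""
    return max(candidates, key=lambda program: score(program, image))


best = best_program(all_programs(), T_IMAGE)
print(best, round(score(best, T_IMAGE), 2))
```

In this toy, the two-stroke program that reproduces the "T" exactly wins: single strokes are cheaper under the prior but leave cells unexplained, while the three-stroke program pays for an extra stroke and inks cells that are not in the image. An exhaustive search is only feasible here because the candidate space is tiny.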
The Science researchers discuss the steps toward making a machine model learn more like a human. | Brenden Lake
After introducing the BPL approach, the researchers directly compared the performance of adults, their BPL model, and other computational approaches on a set of five difficult concept learning tasks, including generating new examples of characters from an alphabet, such as the Tibetan alphabet, seen only a few times. The BPL model performed very well on this task (and better than other computational approaches), with just a 3.3% error rate compared to 4.5% for humans, the researchers report.
Another task involved seeing an image of a new character one time, and then having to select that same character from a set of 20 distinct ones. On this particularly difficult job, the BPL model achieved human-level performance while outperforming recent deep learning approaches, the researchers show.
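The structure of this 20-way, one-example task can be sketched in a few lines of Python. This is only an illustration of the evaluation setup, not the BPL model: the synthetic "characters" and the Jaccard pixel-overlap similarity below are stand-in assumptions, and all function names are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of inked cells (stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify_one_shot(test_image, candidates):
    """Pick the candidate label whose image is most similar to the one example.

    candidates maps each label to a single image (a set of inked cells).
    """
    return max(candidates, key=lambda label: jaccard(test_image, candidates[label]))


def error_rate(trials):
    """Fraction of trials answered incorrectly.

    Each trial is a (test_image, true_label, candidates) tuple.
    """
    wrong = sum(
        1
        for image, truth, candidates in trials
        if classify_one_shot(image, candidates) != truth
    )
    return wrong / len(trials)


def make_candidates(n=20, width=5):
    """Synthetic stand-in 'characters': character k inks row k of a grid."""
    return {k: frozenset((k, col) for col in range(width)) for k in range(n)}


candidates = make_candidates()
# A slightly degraded copy of character 7, standing in for a new handwritten sample.
test_image = candidates[7] - {(7, 0)}
trials = [(test_image, 7, candidates)]
print(classify_one_shot(test_image, candidates), error_rate(trials))
```

Real handwritten characters vary far more than these synthetic rows, which is why a simple pixel-overlap measure would struggle on the actual task and why the choice of model matters.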
In a teleconference for reporters on 9 December, co-author Ruslan Salakhutdinov, assistant professor in the departments of statistics and computer science at the University of Toronto, explained how using this framework could lead to huge advances in a machine's ability to perform visual recognition tasks.
"Take a company like eBay," Salakhutdinov said. "They're trying to sell new things people post, categories of which are growing every day. These items have to be recognized and categorized before they are sold online. It's currently very hard to do this based on seeing one example of a new category." Applying his team's framework, he said, could improve this process.
Reporters in the teleconference asked the researchers how they thought their work would improve technologies such as smartphones.
"Imagine that your smartphone could recognize when you use a new word," Tenenbaum explained. "What if it could say, 'I don't know what that word means; can you tell me?', and then use its newfound understanding of the word to expand the words it recognizes… If you want a system that can learn words it's never heard before, we think you'll be best off using an approach like the one we developed."
Study co-author Brenden Lake, a Moore-Sloan Data Science Fellow at New York University, further explained that while his team's model currently focuses on characters, the underlying approach could eventually be broadened to rapid learning in other types of symbol-based systems (those involving new gestures, for example) and to learning new types of artifacts.