Through this collaboration, they’re planning to analyze how human brain activity and deep learning algorithms trained on language or speech tasks respond to the same written or spoken texts. In theory, the work could reveal how both human brains and artificial ones find meaning in language.

By comparing scans of a person’s brain, taken while they are actively reading, speaking, or listening, against deep learning algorithms given the same set of words and sentences to decipher, researchers hope to find similarities as well as key structural and behavioral differences between brain biology and artificial networks. The research could help explain why humans process language much more efficiently than machines.

“What we’re doing is trying to compare brain activity to machine learning algorithms to understand how the brain functions on one hand, and to try to improve machine learning,” says Jean-Rémi King, a research scientist at Meta AI. “Within the past decade, there has been tremendous progress in AI on a wide variety of tasks from object recognition to automatic translation. But when it comes to tasks which are perhaps not super well defined or need to integrate a lot of knowledge, it seems that AI systems today remain quite challenged, at least, as compared to humans.”

To make the comparison, they’re using whole-brain imaging techniques such as fMRI and magnetoencephalography (a technique used to identify brain activity in response to individual words and sentences down to the millisecond). This allows them to track the brain’s response to words as a function of time. Observing the brain in detail will allow the researchers to see which brain regions are active when a person hears a word like “dog” or “table” (for example, it could be the angular gyrus, which is thought to help humans understand metaphors, or Wernicke’s area, which processes the meanings of sounds). Then, they can inspect the algorithm to see if it’s functioning similarly to the part of the brain that they’re analyzing.
For example, what properties is the AI picking up from the word of interest? Is it associating that word with how it sounds or how it has been used previously?

In previous research, they’ve been able to observe regions of the brain that behave similarly to the way algorithms do for visual representations, word embeddings, and language transformers. For example, King notes that algorithms trained to do character recognition, or transcribe pixels into letters, generate activations correlated with the visual part of the brain.

In a study published in the journal Communications Biology in February, Meta AI researchers found that deep learning algorithms trained to predict a blocked-out word from the context of the sentence behave more like human brains than other algorithms that didn’t have that feature.

“This to us is a strong signal: it suggests that trying to predict the future given the past is probably something akin to what the human brain is trying to do,” King says.

These models also perform well on a range of tasks beyond predicting a missing word from context. “And so this is the path we should try to follow to develop deep learning algorithms,” King says.

But questions remain. Specifically, to what extent do we need the innate structures in our brains, as opposed to cultural influences while we’re growing up, to be efficient at learning language? And how much data and how many parameters does a language model really need to work?

“Kids learn to speak within a couple of years, which is a very small amount of sentences [that they’ve had access to] when you compare this kind of data to what AI systems are typically trained with,” says King.
“It suggests that we have architectures inside our brain which allow us to be more efficient at extracting from language data the structure of the world, the meaning of what people are trying to convey.”

AI systems, on the other hand, are very good at specific tasks, as opposed to general ones. However, when the task becomes too complicated, even if it’s still specific, or “requires bringing different levels of representations to understand how the world works and what motivates people to think in one way or another,” they tend to fall short, says King.

For example, he notes that some natural language processing models still get stumped by syntax. “They capture many syntactic features but are sometimes unable to conjugate the subject and verb when you have some nested syntactic structures in between. Humans have no problem doing these types of things.”

“The density of information along with the depth it can carry is a remarkable feature of language,” King adds. This is something today’s AI lacks, and it could explain why these systems can’t always understand what we’re trying to convey. Being able to have general knowledge of a problem in addition to understanding the emotional or situational context for certain words or phrases may be key to developing better natural conversation AI systems that could one day power future virtual assistants.

As for the natural language processing models themselves, the software that is actually trained to try to understand language, a separate team at Meta AI is building a suite of open-source transformer-based language models with millions, and even billions, of parameters. The smaller models take less energy to run, but are less adept at complex texts and tend to be less accurate. The largest model, which has 175 billion parameters, is similar in size to other industry language models, such as GPT-3. The team also released a corresponding logbook detailing how they built and trained the models.
A transformer-based model “uses both a trained mechanism for representing sequences of information and a mechanism for attention in terms of where to focus in the data. It’s trained in a self-supervised learning manner. So you hide a piece of data, and you predict it, then you reveal what it was to see if you were right or not. If it’s wrong, you back propagate through your network” to fix the error, explains Joelle Pineau, the director of Meta AI Research Labs. “It’s not taking additional context, it’s not using a knowledge graph. It’s looking at the distribution of words in a language based on the dataset on which it’s trained.”

Having a good language model is an important component for chatbots, conversation agents, machine translation, and text classification, which can be used, for example, to sort customer service questions. “All of these applications can be much better if the language model you use is much richer,” says Pineau.

Like Google, Meta AI is open-sourcing its language models to get feedback from other researchers, including those who study the behaviors and ethical impacts of these large AI systems. Pineau hopes that this will make systems that often work like a “black box” more transparent.

At Meta AI, the brain activity research and the creation of the language models are two of the many AI-related projects under investigation. Other notable projects focus on areas related to perception-action, including computer vision, robotics, and video. Plus, Meta is investing in a supercomputer for AI research. Pineau says that, for now, many of these research topics remain separate from one another, but it’s very likely that all of them will eventually overlap and converge in the metaverse.
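
For readers who want to see the hide-and-predict idea Pineau describes in concrete terms, here is a minimal Python sketch. It is an illustration only, not Meta AI's method: it swaps the transformer and backpropagation for simple counting of which word appears between two neighboring words, and the tiny corpus and the function names (`train`, `predict_masked`, `masked_accuracy`) are hypothetical. What it keeps is the self-supervised loop: hide a word, guess it from the distribution of words in the training text, then reveal the answer and score the guess.

```python
from collections import Counter, defaultdict


def pad(sentence):
    """Lowercase and split a sentence, adding start/end markers (an assumption)."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]


def train(sentences):
    """Count which word appears between each (left, right) pair of neighbors."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = pad(sentence)
        for i in range(1, len(words) - 1):
            counts[(words[i - 1], words[i + 1])][words[i]] += 1
    return counts


def predict_masked(counts, left, right):
    """Guess the hidden word as the one most often seen between left and right."""
    candidates = counts.get((left, right))
    if not candidates:
        return None  # this context never appeared in the training text
    return candidates.most_common(1)[0][0]


def masked_accuracy(counts, sentences):
    """Hide each word in turn, predict it, then reveal it and score the guess."""
    correct = total = 0
    for sentence in sentences:
        words = pad(sentence)
        for i in range(1, len(words) - 1):
            guess = predict_masked(counts, words[i - 1], words[i + 1])
            correct += int(guess == words[i])
            total += 1
    return correct / total if total else 0.0


# Toy training text, invented for this example.
corpus = [
    "the dog sat on the rug",
    "the cat sat on the mat",
    "the dog slept on the rug",
]
model = train(corpus)
print(predict_masked(model, "sat", "the"))  # hidden word in "sat ___ the"
print(masked_accuracy(model, corpus))
```

A real language model replaces the lookup table with a neural network whose weights are adjusted after each wrong guess, which is what lets it generalize to contexts it has never seen. The counting version above can only answer for word pairs that appeared in its training text.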