
This AI is constantly learning from new experiences, without forgetting its past.

Our brains are constantly learning. That new caterer is great. That gas station? Better avoid it in the future.

Memories like these physically rewire the connections in the region of the brain that supports new learning. During sleep, memories from the previous day are transferred to other parts of the brain for long-term storage, freeing up brain cells for new experiences the next day. In other words, the brain can continuously soak up our daily lives without losing access to memories of what happened before.

AI, not so much. GPT-4 and other large language and multimodal models that have taken the world by storm are built using deep learning, a family of algorithms that loosely mimic the brain. The problem? “Deep learning systems with standard algorithms are slowly losing their learning ability,” Dr. Shibhansh Dohare of the University of Alberta recently told Nature.

The reason lies in how these models are built and trained. Deep learning relies on layers of artificial neurons connected to one another. Feeding the algorithms data (for example, mountains of online material such as blogs, news articles, and comments on YouTube and Reddit) changes the strength of these connections, so that the AI eventually “learns” patterns in the data and uses them to produce eloquent answers.

But these systems are actually brains frozen in time. Tackling a new task sometimes requires a whole new cycle of training and learning, which erases what came before and costs millions of dollars. For ChatGPT and other AI tools, this means they become increasingly obsolete over time.

This week, Dohare and his colleagues proposed a fix: selectively resetting certain artificial neurons after a task without substantially changing the rest of the network, much like what happens in the brain while we sleep.

When tested on continual visual learning tasks (distinguishing cats from houses, or stop signs from school buses), deep learning algorithms equipped with selective resetting easily maintained high accuracy across more than 5,000 different tasks. Standard algorithms, on the other hand, quickly deteriorated, eventually dropping to the level of a coin toss.

Called continual backpropagation, the strategy is “one of the first in a large and growing set of methods” to address the problem of continual learning, wrote Dr. Clare Lyle and Dr. Razvan Pascanu of Google DeepMind, who were not involved in the study.

Spirit of the machine

Deep learning is one of the most popular methods for training AI. Inspired by the brain, these algorithms feature layers of artificial neurons that connect to form artificial neural networks.

As an algorithm learns, some connections become stronger, while others become weaker. This process, called plasticity, mimics the way the brain learns and optimizes artificial neural networks so they can provide the best answer to a problem.

But deep learning algorithms are not as flexible as the brain. Once trained, their weights are locked in place. Learning a new task reconfigures the weights in existing networks, and in doing so, the AI “forgets” previous experiences. This is generally not a problem for classic uses like image recognition or language processing, as long as the model doesn’t need to adapt to new data on the fly. But it becomes very problematic when training and using more sophisticated algorithms, such as those that learn from and react to their environment the way humans do.

Taking a classic game example, “a neural network can be trained to achieve a perfect score on the video game Pong, but training the same network to then play Space Invaders will result in a significant drop in its performance on Pong,” Lyle and Pascanu wrote.

Computer scientists have been grappling with this problem, aptly called catastrophic forgetting, for years. One simple solution is to wipe the slate clean and retrain the AI on a new task from scratch, using a combination of old and new data. While this nuclear option restores the AI’s capabilities, it also erases all prior knowledge. And while the strategy is feasible for small AI models, it’s impractical for huge ones, like those that power large language models.

Save it

The new study draws on a fundamental mechanism of deep learning, a process called backpropagation. In simple terms, backpropagation provides feedback to the artificial neural network. Depending on how close the output is to the correct answer, it adjusts the algorithm’s internal connections until the network learns the task at hand. However, with continual learning, neural networks quickly lose their plasticity and can no longer learn.
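To make that feedback loop concrete, here is a minimal sketch of backpropagation in Python with NumPy: a tiny two-layer network learning the toy XOR problem. It illustrates the general mechanism only; the network size, learning rate, and task are arbitrary choices, not the setup used in the study.

```python
# Minimal sketch of backpropagation (not the authors' code): a tiny
# two-layer network learns the XOR toy problem.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8))   # input -> hidden connections
W2 = rng.normal(size=(8, 1))   # hidden -> output connections
lr = 0.5                       # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass: compute the network's current answer.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)

    # Backward pass: measure the error and trace it back through the layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Update: strengthen or weaken connections to shrink the error.
    W2 -= lr * h.T @ d_out
    W1 -= lr * X.T @ d_h

print(np.round(out, 2))  # should move toward [0, 1, 1, 0]
```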

The team took a first step towards solving the problem by using a 1959 theory with the impressive name “Selfridge’s Pandemonium.” This theory describes how we continuously process visual information and has heavily influenced AI for image recognition and other fields.

Using ImageNet, a classic repository of millions of images for training AI, the team established that standard deep learning models gradually lose their plasticity when faced with thousands of sequential tasks. These tasks are ridiculously simple for humans: telling cats from houses, for example, or stop signs from school buses.

By this measure, any drop in performance means the AI is gradually losing its ability to learn. On early tasks, the standard deep learning algorithms were accurate up to 88 percent of the time. But by task 2,000 they had lost their plasticity, and performance had fallen to near or below the baseline.

The updated algorithm performed much better.

The algorithm still uses backpropagation, but with one difference: during each training cycle, a small portion of the artificial neurons are wiped and reinitialized. To avoid disrupting the whole network, only the least-used artificial neurons are reset. This upgrade allowed the algorithm to handle up to 5,000 different image recognition tasks with over 90 percent accuracy.
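In code, that selective-reset step might look something like the sketch below. This is a simplified illustration of the idea rather than the paper’s exact continual backpropagation algorithm: the utility score (how active a unit is, weighted by how much the output relies on it) and the replacement rate are stand-ins for the more careful choices made in the study.

```python
# Simplified sketch of the selective-reset idea behind continual
# backpropagation (not the paper's exact algorithm): after each ordinary
# backprop update, a tiny fraction of the least-used hidden units is
# re-initialized. The utility score and replacement rate are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 20, 64, 1
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))   # input -> hidden
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))  # hidden -> output
replacement_rate = 1e-3   # expected fraction of hidden units reset per step

def selective_reset(h, W1, W2):
    """Re-initialize the lowest-utility hidden units in place."""
    # Utility: how active a unit is, times how strongly the output relies on it.
    utility = np.abs(h).mean(axis=0) * np.abs(W2).sum(axis=1)
    n_reset = rng.binomial(n_hidden, replacement_rate)
    if n_reset == 0:
        return
    idx = np.argsort(utility)[:n_reset]                # least useful units
    W1[:, idx] = rng.normal(scale=0.1, size=(n_in, n_reset))
    W2[idx, :] = 0.0                                   # fresh units start silent

# One training step on random stand-in data, followed by the reset.
X = rng.normal(size=(32, n_in))
y = rng.normal(size=(32, n_out))
h = np.maximum(0.0, X @ W1)                            # ReLU hidden layer
pred = h @ W2
grad_out = (pred - y) / len(X)                         # gradient of squared error
grad_h = (grad_out @ W2.T) * (h > 0)
W2 -= 0.01 * h.T @ grad_out
W1 -= 0.01 * X.T @ grad_h
selective_reset(h, W1, W2)
```

Because only the weakest units are touched, the bulk of what the network has already learned stays intact while fresh capacity is freed up for new tasks.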

In another proof of concept, the team used the algorithm to drive a simulated ant-like robot across multiple terrains to see how quickly it could learn and adapt with feedback.

With continual backpropagation, the simulated creature easily navigated a video-game course with varying friction, as if hiking over sand, pavement, and rocks. The robot driven by the new algorithm kept going for at least 50 million steps. Those driven by standard algorithms crashed much earlier, with performance dropping to zero about 30 percent sooner.

This study is the latest to tackle the problem of plasticity in deep learning.

A previous study showed that dormant neurons (those that no longer respond to signals from their network) make AI more rigid, and that reinitializing them throughout training improves performance. But dormant neurons are not the only cause of the problem, Lyle and Pascanu write. A network’s loss of plasticity could also stem from interactions between networks that destabilize how the AI learns. Scientists are still in the early stages of understanding this.
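As a rough illustration of that dormant-neuron fix, a recycling step might look like the following sketch; the relative-activity score and the one-percent threshold are illustrative assumptions rather than values from the cited study.

```python
# Sketch of detecting and recycling dormant hidden units (those that barely
# respond to their inputs). The relative-activity score and 1% threshold are
# illustrative assumptions, not values from the cited study.
import numpy as np

def recycle_dormant_units(h, W_in, W_out, rng, threshold=0.01):
    """Re-initialize hidden units whose activity is near zero; return their indices."""
    activity = np.abs(h).mean(axis=0)               # mean activation per unit
    score = activity / (activity.mean() + 1e-8)     # activity relative to the layer
    dormant = np.flatnonzero(score < threshold)
    if dormant.size:
        W_in[:, dormant] = rng.normal(scale=0.1, size=(W_in.shape[0], dormant.size))
        W_out[dormant, :] = 0.0                     # recycled units start silent
    return dormant
```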

Meanwhile, for practical uses of AI, “you have to keep up with the times,” Dohare said. Continual learning isn’t just about distinguishing cats from houses. It could also help self-driving cars better navigate new streets in changing weather or lighting conditions, especially in regions with microenvironments, where fog can quickly give way to bright sunshine.

Tackling this problem “offers an exciting opportunity” that could lead to AI that can retain acquired knowledge while learning new information and, like us humans, adapt flexibly to a changing world. “These capabilities are crucial for the development of truly adaptive AI systems that can continue to train indefinitely, respond to changes in the world, and learn new skills and abilities,” Lyle and Pascanu wrote.

Photo credit: Jaredd Craig / Unsplash