Imagine an artificial intelligence (AI) model that can see and understand moving images with the subtlety of a human brain. Now, scientists at Scripps Research have made this a reality by creating MovieNet, a groundbreaking AI that processes video much like the way our brains interpret real-life scenes as they unfold over time.
This brain-inspired AI model, described in the Proceedings of the National Academy of Sciences on November 19, 2024, can recognize moving scenes by simulating how neurons, or brain cells, make sense of the world in real time. While conventional AI excels at recognizing still images, MovieNet introduces a machine-learning approach to recognizing complex, changing scenes. This breakthrough technology could transform fields from medical diagnostics to autonomous driving, where discerning subtle changes over time is critical. MovieNet is also more accurate and more environmentally sustainable than conventional AI.
“The brain doesn’t just see still frames; it creates an ongoing visual narrative,” says senior author Hollis Cline, Ph.D., director of the Dorris Neuroscience Center and the Hahn Professor of Neuroscience at Scripps Research. “Static image recognition has come a long way, but the brain’s ability to process flowing scenes, like watching a movie, requires a much more sophisticated form of pattern recognition. By studying how neurons capture these sequences, we’ve been able to apply similar principles to AI.”
To create MovieNet, Cline and first author Masaki Hiramoto, a staff scientist at Scripps Research, investigated how the brain processes real-world scenes into short sequences that resemble movie clips. Specifically, the researchers studied how tadpole neurons respond to visual stimulation.
“Tadpoles have a very good visual system, and we know that they can efficiently detect and respond to moving stimuli,” explains Hiramoto.
He and Cline identified neurons that respond to movie-like features, such as shifts in brightness and image rotation, and that can recognize objects as they move and change. These neurons, located in the brain’s visual processing region known as the optic tectum, assemble parts of a moving image into a coherent sequence.
Think of this process as similar to a lenticular puzzle. Each piece may not make sense on its own, but together they form a complete moving image. Different neurons process different “puzzle pieces” of a real-world moving image, which the brain integrates into a continuous scene.
The researchers also found that neurons in the tadpoles’ optic tectum distinguish subtle changes in visual stimuli over time, capturing information in dynamic clips of roughly 100 to 600 milliseconds rather than as still frames. These neurons are highly sensitive to patterns of light and shadow, and each neuron’s response to a specific part of the visual field helps construct a detailed map of the scene, forming a “movie clip.”
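To picture what this means computationally, here is a minimal sketch in Python of grouping video frames into short temporal windows instead of analyzing single frames. The frame rate, the 300-millisecond window and the split_into_clips helper are illustrative assumptions, not details from the study.

```python
# Hypothetical sketch: grouping frames into ~300 ms windows, echoing how
# tectal neurons encode dynamic clips of roughly 100-600 ms rather than
# still frames. All parameters here are assumed for illustration.
import numpy as np

FPS = 30                                   # assumed capture rate
WINDOW_MS = 300                            # inside the 100-600 ms range
FRAMES_PER_CLIP = FPS * WINDOW_MS // 1000  # -> 9 frames per clip

def split_into_clips(video: np.ndarray) -> np.ndarray:
    """Reshape a (frames, height, width) video into non-overlapping
    clips of shape (n_clips, FRAMES_PER_CLIP, height, width)."""
    n_clips = video.shape[0] // FRAMES_PER_CLIP
    usable = video[: n_clips * FRAMES_PER_CLIP]
    return usable.reshape(n_clips, FRAMES_PER_CLIP, *video.shape[1:])

# Example: 3 seconds of 64x64 grayscale video -> ten 300 ms clips
video = np.random.rand(90, 64, 64)
print(split_into_clips(video).shape)  # (10, 9, 64, 64)
```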
Cline and Hiramoto trained MovieNet to emulate brain-like processing and encode video clips into a series of small, recognizable visual cues. This allowed the AI model to distinguish subtle differences between dynamic scenes.
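The encoding step can be pictured as reducing each short clip to a compact visual cue and representing a scene as the sequence of those cues. The sketch below, built on hypothetical encode_clip and scene_signature helpers and a crude brightness-change cue, illustrates the general idea only; it is not MovieNet’s actual encoder.

```python
# Illustrative stand-in for clip encoding: each clip becomes a small
# matrix of frame-to-frame brightness changes over a coarse spatial grid,
# and a scene is the concatenation of its clips' cues.
import numpy as np

def encode_clip(clip: np.ndarray, grid: int = 4) -> np.ndarray:
    """Reduce a (frames, H, W) clip to a (frames - 1, grid * grid) cue:
    mean brightness change per grid cell (H and W divisible by grid)."""
    f, h, w = clip.shape
    cells = clip.reshape(f, grid, h // grid, grid, w // grid).mean(axis=(2, 4))
    return np.diff(cells.reshape(f, grid * grid), axis=0)

def scene_signature(clips: np.ndarray) -> np.ndarray:
    """Concatenate per-clip cues into one vector a classifier could
    compare across scenes (e.g., normal vs. abnormal swimming)."""
    return np.concatenate([encode_clip(c).ravel() for c in clips])

# Example: ten 9-frame clips -> one 1,280-dimensional scene signature
clips = np.random.rand(10, 9, 64, 64)
print(scene_signature(clips).shape)  # (1280,)
```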
To test MovieNet, the researchers showed it video clips of tadpoles swimming under various conditions. MovieNet not only achieved 82.3% accuracy in distinguishing normal from abnormal swimming behaviors, it outperformed trained human observers by about 18%. It also beat existing AI models such as Google’s GoogLeNet, which achieved just 72% accuracy despite extensive training and processing resources.
“This is where we see the real potential,” Cline points out.
The team determined that MovieNet not only understands changing scenes better than current AI models, it uses less data and processing time. MovieNet’s ability to simplify data without sacrificing accuracy also sets it apart from conventional AI: by breaking down visual information into essential sequences, it effectively compresses data like a zipped file that retains critical details.
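As a rough analogy for that kind of compression, the sketch below drops cue vectors that carry little frame-to-frame change, keeping only the informative parts of a sequence. The change-energy criterion, the threshold and the keep_essential helper are invented for illustration; the study does not describe MovieNet’s compression in these terms.

```python
# Hypothetical compression step: discard near-static cue vectors so the
# sequence shrinks while its informative moments survive.
import numpy as np

def keep_essential(cues: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    """Keep only cue vectors whose change energy (L2 norm) exceeds a
    threshold; an assumed criterion standing in for 'retain key details'."""
    energy = np.linalg.norm(cues, axis=1)
    return cues[energy > threshold]

# Example: five static moments are dropped, three large changes are kept
cues = np.vstack([np.zeros((5, 16)), np.ones((3, 16))])
print(keep_essential(cues).shape)  # (3, 16)
```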
Beyond its high accuracy, MovieNet is an eco-friendly AI model. Existing AI processing requires enormous amounts of energy and therefore has a significant impact on the environment. MovieNet’s reduced data requirements provide a greener alternative that saves energy while performing at a high level.
“By mimicking the brain, we significantly reduce the requirements for AI, paving the way for a model that is not only powerful but also sustainable,” says Cline. “These efficiencies also open the door to scaling AI in areas where traditional methods are costly.”
MovieNet also has the potential to reshape medicine. As the technology advances, it could become a valuable tool for identifying subtle changes in early-stage conditions, such as detecting irregular heart rhythms or spotting the first signs of neurodegenerative diseases like Parkinson’s. For example, small motor changes related to Parkinson’s that are hard for the human eye to identify could be flagged by the AI early on, giving clinicians valuable time to intervene.
Additionally, MovieNet’s ability to detect changes in tadpole swimming patterns when tadpoles are exposed to chemicals could lead to more precise drug screening techniques, since scientists could study dynamic cellular responses rather than relying on static snapshots.
“Current methods are missing important changes because they can only analyze images captured at intervals,” says Hiramoto. “Observing cells over time means MovieNet can track the most subtle changes during drug testing.”
Going forward, Cline and Hiramoto plan to continue improving MovieNet’s ability to adapt to different environments, enhancing its versatility and potential applications.
“Taking inspiration from biology will continue to be a fertile area for AI advancement,” says Cline. “By designing models that think like living organisms, we can achieve levels of efficiency that are not possible with traditional approaches.”
This study, “Identification of movie encoding neurons enables movie recognition AI,” was supported by funding from the National Institutes of Health (R01EY011261, R01EY027437 and R01EY031597), the Hahn Family Foundation and the Harold L. Dorris Neuroscience Center Endowment Fund.